
dm-mq and end_clone_request()

Message ID 20160801204628.GA94704@redhat.com (mailing list archive)
State Not Applicable, archived

Commit Message

Mike Snitzer Aug. 1, 2016, 8:46 p.m. UTC
On Mon, Aug 01 2016 at  2:55pm -0400,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> On 08/01/2016 10:59 AM, Mike Snitzer wrote:
> >This says to me that must_push_back is returning false because
> >dm_noflush_suspending() is false.  When this happens -EIO will escape up
> >the IO stack.
> >
> >And this confirms that must_push_back() calling dm_noflush_suspending()
> >is quite suspect given queue_if_no_path was configured: we should
> >_always_ pushback if no paths are available.
> >
> >I'll dig deeper on really understanding _why_ must_push_back() is coded
> >like it is.
> 
> Hello Mike,
> 
> Earlier I had reported that I observe this behavior with
> CONFIG_DM_MQ_DEFAULT=y after the first simulated cable pull. I have been
> able to reproduce it with CONFIG_DM_MQ_DEFAULT=n, but it takes a large
> number of iterations to trigger. The output that appears on my setup in
> the kernel log, with a bunch of printk()'s added in the dm-mpath driver,
> for CONFIG_DM_MQ_DEFAULT=n is as follows (mpath 254:0 and
> /dev/mapper/mpathbe refer to the same multipath device):
> 
> [  314.755582] mpath 254:0: queue_if_no_path 0 -> 1
> [  314.770571] executing DM ioctl DEV_SUSPEND on mpathbe
> [  314.770622] mpath 254:0: queue_if_no_path 1 -> 0
> [  314.770657] __multipath_map(): (a) returning -5
> [  314.770657] map_request(): clone_and_map_rq() returned -5
> [  314.770658] dm_complete_request: error = -5

Hi Bart,

Please retry both variants (CONFIG_DM_MQ_DEFAULT=y first) with this patch
applied.  Interested to see if things look better for you (WARN_ON_ONCEs
added just to see if we hit the corresponding suspend/stopped state
while mapping requests -- if so this speaks to an inherently racy
problem that will need further investigation for a proper fix but
results from this should let us know if we're closer).
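
For reference, the code paths under discussion looked roughly like this in
drivers/md/dm-mpath.c around v4.7 (a paraphrased sketch, not the exact
upstream source; locking and unrelated fields are omitted):

/*
 * Suspend temporarily clears queue_if_no_path, saving the old value so
 * that multipath_resume() can restore it.  This is why the log above
 * shows "queue_if_no_path 1 -> 0" when DEV_SUSPEND is executed.
 */
static void multipath_presuspend(struct dm_target *ti)
{
	struct multipath *m = ti->private;

	queue_if_no_path(m, false, true);	/* clear, but save old value */
}

/*
 * Called from __multipath_map() when no path is available; if this
 * returns false, __multipath_map() fails the request with -EIO -- the
 * error seen escaping up the stack above.  Because presuspend already
 * cleared queue_if_no_path, pushing back here hinges entirely on
 * dm_noflush_suspending(), the suspect condition described above.
 */
static bool must_push_back(struct multipath *m)
{
	return (m->queue_if_no_path ||
		(m->queue_if_no_path != m->saved_queue_if_no_path &&
		 dm_noflush_suspending(m->ti)));
}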


Comments

Bart Van Assche Aug. 1, 2016, 10:41 p.m. UTC | #1
On 08/01/2016 01:46 PM, Mike Snitzer wrote:
> Please retry both variants (CONFIG_DM_MQ_DEFAULT=y first) with this patch
> applied.  Interested to see if things look better for you (WARN_ON_ONCEs
> added just to see if we hit the corresponding suspend/stopped state
> while mapping requests -- if so this speaks to an inherently racy
> problem that will need further investigation for a proper fix but
> results from this should let us know if we're closer).
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 1b2f962..0e0f6e0 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -2007,6 +2007,9 @@ static int map_request(struct dm_rq_target_io *tio, struct request *rq,
>  	struct dm_target *ti = tio->ti;
>  	struct request *clone = NULL;
>  
> +	if (WARN_ON_ONCE(unlikely(dm_suspended_md(md))))
> +		return DM_MAPIO_REQUEUE;
> +
>  	if (tio->clone) {
>  		clone = tio->clone;
>  		r = ti->type->map_rq(ti, clone, &tio->info);
> @@ -2722,6 +2725,9 @@ static int dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
>  		dm_put_live_table(md, srcu_idx);
>  	}
>  
> +	if (WARN_ON_ONCE(unlikely(test_bit(BLK_MQ_S_STOPPED, &hctx->state))))
> +		return BLK_MQ_RQ_QUEUE_BUSY;
> +
>  	if (ti->type->busy && ti->type->busy(ti))
>  		return BLK_MQ_RQ_QUEUE_BUSY;

Hello Mike,

The test results with this patch, plus the three other patches that have
been posted in the context of this e-mail thread, applied on top of
kernel v4.7 are as follows:

(1) CONFIG_DM_MQ_DEFAULT=y and fio running on top of XFS:

From the system log:

[ ... ]
mpath 254:0: queue_if_no_path 0 -> 1
executing DM ioctl DEV_SUSPEND on mpathbe
mpath 254:0: queue_if_no_path 1 -> 0
__multipath_map(): (a) returning -5
map_request(): clone_and_map_rq() returned -5
dm_complete_request: error = -5
dm_softirq_done: dm-0 tio->error = -5
blk_update_request: I/O error (-5), dev dm-0, sector 311960
[ ... ]

After this test finished, "dmsetup remove_all" failed and the following
message appeared in the system log: "device-mapper: ioctl: remove_all
left 1 open device(s)".

Note: when I reran this test after a reboot, "dmsetup remove_all" succeeded.


(2) CONFIG_DM_MQ_DEFAULT=y and fio running on top of ext4:

From the system log:
[ ... ]
[  146.023067] WARNING: CPU: 2 PID: 482 at drivers/md/dm.c:2748 dm_mq_queue_rq+0xc1/0x150 [dm_mod]
[  146.026073] Workqueue: kblockd blk_mq_run_work_fn
[  146.026083] Call Trace:
[  146.026087]  [<ffffffff81320047>] dump_stack+0x68/0xa1
[  146.026090]  [<ffffffff81061c46>] __warn+0xc6/0xe0
[  146.026092]  [<ffffffff81061d18>] warn_slowpath_null+0x18/0x20
[  146.026098]  [<ffffffffa0286791>] dm_mq_queue_rq+0xc1/0x150 [dm_mod]
[  146.026100]  [<ffffffff81306f7a>] __blk_mq_run_hw_queue+0x1da/0x350
[  146.026102]  [<ffffffff813076c0>] blk_mq_run_work_fn+0x10/0x20
[  146.026105]  [<ffffffff8107efe9>] process_one_work+0x1f9/0x6a0
[  146.026109]  [<ffffffff8107f4d9>] worker_thread+0x49/0x490
[  146.026116]  [<ffffffff81085cda>] kthread+0xea/0x100
[  146.026119]  [<ffffffff81624fbf>] ret_from_fork+0x1f/0x40
[ ... ]
[  146.269194] mpath 254:1: queue_if_no_path 0 -> 1
[  146.276502] executing DM ioctl DEV_SUSPEND on mpathbf
[  146.276556] mpath 254:1: queue_if_no_path 1 -> 0
[  146.276560] __multipath_map(): (a) returning -5
[  146.276561] map_request(): clone_and_map_rq() returned -5
[  146.276562] dm_complete_request: error = -5
[  146.276563] dm_softirq_done: dm-1 tio->error = -5
[  146.276566] blk_update_request: I/O error (-5), dev dm-1, sector 2097144
[ ... ]

After this test finished, running "dmsetup remove_all" and unloading ib_srp
succeeded.


(3) CONFIG_DM_MQ_DEFAULT=n and fio running on top of XFS:

The first run of this test passed. During the second run fio reported
an I/O error. From the system log:

[ ... ]
[ 1290.010886] mpath 254:0: queue_if_no_path 0 -> 1
[ 1290.026905] executing DM ioctl DEV_SUSPEND on mpathbe
[ 1290.026960] mpath 254:0: queue_if_no_path 1 -> 0
[ 1290.027001] __multipath_map(): (a) returning -5
[ 1290.027002] map_request(): clone_and_map_rq() returned -5
[ 1290.027003] dm_complete_request: error = -5
[ ... ]


(4) CONFIG_DM_MQ_DEFAULT=n and fio running on top of ext4:

The first two runs of this test passed. After the second run "dmsetup
remove_all" failed and the following error message appeared in the system
log: "device-mapper: ioctl: remove_all left 1 open device(s)". The following
kernel thread might be the one that was holding open /dev/dm-0:

# ps aux | grep dio/
root      5306  0.0  0.0      0     0 ?        S<   15:24   0:00 [dio/dm-0]


Please let me know if you need more information.

Bart.
Mike Snitzer Aug. 2, 2016, 5:45 p.m. UTC | #2
On Mon, Aug 01 2016 at  6:41pm -0400,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> On 08/01/2016 01:46 PM, Mike Snitzer wrote:
> > Please retry both variants (CONFIG_DM_MQ_DEFAULT=y first) with this patch
> > applied.  Interested to see if things look better for you (WARN_ON_ONCEs
> > added just to see if we hit the corresponding suspend/stopped state
> > while mapping requests -- if so this speaks to an inherently racy
> > problem that will need further investigation for a proper fix but
> > results from this should let us know if we're closer).
> > 
> > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > index 1b2f962..0e0f6e0 100644
> > --- a/drivers/md/dm.c
> > +++ b/drivers/md/dm.c
> > @@ -2007,6 +2007,9 @@ static int map_request(struct dm_rq_target_io *tio, struct request *rq,
> >  	struct dm_target *ti = tio->ti;
> >  	struct request *clone = NULL;
> >  
> > +	if (WARN_ON_ONCE(unlikely(dm_suspended_md(md))))
> > +		return DM_MAPIO_REQUEUE;
> > +
> >  	if (tio->clone) {
> >  		clone = tio->clone;
> >  		r = ti->type->map_rq(ti, clone, &tio->info);
> > @@ -2722,6 +2725,9 @@ static int dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
> >  		dm_put_live_table(md, srcu_idx);
> >  	}
> >  
> > +	if (WARN_ON_ONCE(unlikely(test_bit(BLK_MQ_S_STOPPED, &hctx->state))))
> > +		return BLK_MQ_RQ_QUEUE_BUSY;
> > +
> >  	if (ti->type->busy && ti->type->busy(ti))
> >  		return BLK_MQ_RQ_QUEUE_BUSY;
> 
> Hello Mike,
> 
> The test results with this patch, plus the three other patches that have
> been posted in the context of this e-mail thread, applied on top of
> kernel v4.7 are as follows:

Hi Bart,

Please do these same tests against a v4.7 kernel with the 4 patches from
this branch applied (no need for your other debug patches):
https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-4.7-mpath-fixes

I've had good results with my blk-mq SRP based testing.

Thanks,
Mike
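
For anyone following along, fetching and checking out that branch looks
roughly like this (a sketch; the cgit URL above corresponds to the pub/scm
clone URL):

git fetch https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git \
    dm-4.7-mpath-fixes
git checkout -b dm-4.7-mpath-fixes FETCH_HEAD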
Bart Van Assche Aug. 3, 2016, 12:19 a.m. UTC | #3
On 08/02/2016 10:45 AM, Mike Snitzer wrote:
> Please do these same tests against a v4.7 kernel with the 4 patches from
> this branch applied (no need for your other debug patches):
> https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-4.7-mpath-fixes
> 
> I've had good results with my blk-mq SRP based testing.

Hello Mike,

Thanks again for having made these patches available. The results of my
tests are as follows:

(1) CONFIG_DM_MQ_DEFAULT=y, fio running on top of xfs

The first simulated cable pull caused the following messages to appear:

[  428.716566] mpath 254:1: queue_if_no_path 1 -> 0
[  428.729671] __multipath_map(): (a) returning -5
[  428.729730] map_request(): clone_and_map_rq() returned -5
[  428.729788] dm_complete_request: error = -5
[  428.729846] dm_softirq_done: dm-1 tio->error = -5
[  428.729904] blk_update_request: 880 callbacks suppressed
[  428.729970] blk_update_request: I/O error (-5), dev dm-1, sector 2097024

(2) CONFIG_DM_MQ_DEFAULT=y, fio running on top of ext4

The first simulated cable pull caused the following messages to appear:

[  162.894737] mpath 254:0: queue_if_no_path 0 -> 1
[  162.903155] executing DM ioctl DEV_SUSPEND on mpathbe
[  162.903207] mpath 254:0: queue_if_no_path 1 -> 0
[  162.903255] device-mapper: multipath: must_push_back: queue_if_no_path=0 suspend_active=1 suspending=0
[  162.903256] __multipath_map(): (a) returning -5
[  162.903257] map_request(): clone_and_map_rq() returned -5
[  162.903258] dm_complete_request: error = -5
[  162.903259] dm_softirq_done: dm-0 tio->error = -5
[  162.903261] blk_update_request: I/O error (-5), dev dm-0, sector 263424
[  162.903284] Buffer I/O error on dev dm-0, logical block 32928, lost sync page write

(3) CONFIG_DM_MQ_DEFAULT=n, fio running on top of xfs

This test passed once, but on the second run fio reported "bad magic header"
after a large number of iterations. I'm still analyzing the logs.

(4) CONFIG_DM_MQ_DEFAULT=n, fio running on top of ext4

I ran this test three times. The first two runs passed, but during the third
run fio again reported I/O errors and I found the following in the kernel log:

[  954.048860] __multipath_map(): (a) returning -5
[  954.048861] map_request(): clone_and_map_rq() returned -5
[  954.048862] dm_complete_request: error = -5
[  954.048870] dm_softirq_done: dm-0 tio->error = -5
[  954.048873] blk_update_request: 15 callbacks suppressed
[  954.048874] blk_update_request: I/O error (-5), dev dm-0, sector 159976

Bart.
Mike Snitzer Aug. 3, 2016, 12:40 a.m. UTC | #4
On Tue, Aug 02 2016 at  8:19pm -0400,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> On 08/02/2016 10:45 AM, Mike Snitzer wrote:
> > Please do these same tests against a v4.7 kernel with the 4 patches from
> > this branch applied (no need for your other debug patches):
> > https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-4.7-mpath-fixes
> > 
> > I've had good results with my blk-mq SRP based testing.
> 
> Hello Mike,
> 
> Thanks again for having made these patches available. The results of my
> tests are as follows:

Disappointing.  But I asked you to run the v4.7 kernel patches I
pointed to _without_ any of your debug patches.

I cannot reproduce on our SRP testbed with the fixes I provided.  We're
now in a place where there would appear to be something unique to
your environment causing these failures.
Laurence Oberman Aug. 3, 2016, 1:33 a.m. UTC | #5
Hi Bart

I simplified the test to 2 simple scripts, running against only one XFS file system.
Can you validate these and tell me if it's enough to emulate what you are doing?
Perhaps our test-suite is too simple.

Start the test

# cat run_test.sh
#!/bin/bash
logger "Starting Bart's test"
#for i in `seq 1 10`
for i in 1
do
	fio --verify=md5 -rw=randwrite --size=10M --bs=4K --loops=$((10**6)) \
        --iodepth=64 --group_reporting --sync=1 --direct=1 --ioengine=libaio \
        --directory="/data-$i" --name=data-integrity-test --thread --numjobs=16 \
        --runtime=600 --output=fio-output.txt >/dev/null &
done

Delete the hosts; I wait 10s between host deletions.
But I also tested with 3s and it's still stable with Mike's patches.

#!/bin/bash
for i in /sys/class/srp_remote_ports/*
do
 echo "Deleting host $i, it will re-connect via srp_daemon" 
 echo 1 > $i/delete
 sleep 10
done

Check for I/O errors affecting XFS; we now have none with the patches Mike provided.
After recovery I can create files in the XFS mount with no issues.
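
A quick way to confirm the paths re-connect after each deletion pass (a
minimal check, assuming standard multipath-tools) is to count the paths
that have come back:

# Each restored path shows up in "multipath -ll" as "active ready running".
multipath -ll | grep -c "active ready running"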

Can you use my scripts and one mount and see if it still fails for you?

Thanks
Laurence

----- Original Message -----
> From: "Mike Snitzer" <snitzer@redhat.com>
> To: "Bart Van Assche" <bart.vanassche@sandisk.com>
> Cc: dm-devel@redhat.com, "Laurence Oberman" <loberman@redhat.com>, linux-scsi@vger.kernel.org
> Sent: Tuesday, August 2, 2016 8:40:14 PM
> Subject: Re: dm-mq and end_clone_request()
> 
> On Tue, Aug 02 2016 at  8:19pm -0400,
> Bart Van Assche <bart.vanassche@sandisk.com> wrote:
> 
> > On 08/02/2016 10:45 AM, Mike Snitzer wrote:
> > > Please do these same tests against a v4.7 kernel with the 4 patches from
> > > this branch applied (no need for your other debug patches):
> > > https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-4.7-mpath-fixes
> > > 
> > > I've had good results with my blk-mq SRP based testing.
> > 
> > Hello Mike,
> > 
> > Thanks again for having made these patches available. The results of my
> > tests are as follows:
> 
> Disappointing.  But I asked you to run the v4.7 kernel patches I
> pointed to _without_ any of your debug patches.
> 
> I cannot reproduce on our SRP testbed with the fixes I provided.  We're
> now in a place where there would appear to be something unique to
> your environment causing these failures.
> 
Mike Snitzer Aug. 3, 2016, 2:10 a.m. UTC | #6
On Tue, Aug 02 2016 at  9:33pm -0400,
Laurence Oberman <loberman@redhat.com> wrote:

> Hi Bart
> 
> I simplified the test to 2 simple scripts, running against only one XFS file system.
> Can you validate these and tell me if it's enough to emulate what you are doing?
> Perhaps our test-suite is too simple.
> 
> Start the test
> 
> # cat run_test.sh
> #!/bin/bash
> logger "Starting Bart's test"
> #for i in `seq 1 10`
> for i in 1
> do
> 	fio --verify=md5 -rw=randwrite --size=10M --bs=4K --loops=$((10**6)) \
>         --iodepth=64 --group_reporting --sync=1 --direct=1 --ioengine=libaio \
>         --directory="/data-$i" --name=data-integrity-test --thread --numjobs=16 \
>         --runtime=600 --output=fio-output.txt >/dev/null &
> done
> 
> Delete the hosts; I wait 10s between host deletions.
> But I also tested with 3s and it's still stable with Mike's patches.
> 
> #!/bin/bash
> for i in /sys/class/srp_remote_ports/*
> do
>  echo "Deleting host $i, it will re-connect via srp_daemon" 
>  echo 1 > $i/delete
>  sleep 10
> done
> 
> Check for I/O errors affecting XFS; we now have none with the patches Mike provided.
> After recovery I can create files in the XFS mount with no issues.
> 
> Can you use my scripts and one mount and see if it still fails for you?

In parallel we can try Bart's testsuite that he shared earlier in this
thread: https://github.com/bvanassche/srp-test

README.md says:
"Running these tests manually is tedious. Hence this test suite that
tests the SRP initiator and target drivers by loading both drivers on
the same server, by logging in using the IB loopback functionality and
by sending I/O through the SRP initiator driver to a RAM disk exported
by the SRP target driver."

This could explain why Bart is still seeing issues.  He isn't testing
real hardware -- as such he is using ramdisk to expose races, etc.

Mike
Laurence Oberman Aug. 3, 2016, 2:18 a.m. UTC | #7
----- Original Message -----
> From: "Mike Snitzer" <snitzer@redhat.com>
> To: "Laurence Oberman" <loberman@redhat.com>
> Cc: "Bart Van Assche" <bart.vanassche@sandisk.com>, dm-devel@redhat.com, linux-scsi@vger.kernel.org
> Sent: Tuesday, August 2, 2016 10:10:12 PM
> Subject: Re: dm-mq and end_clone_request()
> 
> In parallel we can try Bart's testsuite that he shared earlier in this
> thread: https://github.com/bvanassche/srp-test
> 
> README.md says:
> "Running these tests manually is tedious. Hence this test suite that
> tests the SRP initiator and target drivers by loading both drivers on
> the same server, by logging in using the IB loopback functionality and
> by sending I/O through the SRP initiator driver to a RAM disk exported
> by the SRP target driver."
> 
> This could explain why Bart is still seeing issues.  He isn't testing
> real hardware -- as such he is using ramdisk to expose races, etc.
> 
> Mike
> 

Hi Mike,

I looked at Bart's scripts; they looked fine, but I wanted a simpler way to bring the error out.
Using a ramdisk is not uncommon as an LIO backend via ib_srpt to serve LUNs.
That is the same way I do it when I am not connected to a large array, as it is the only way I can get EDR-like speeds.

I don't think it's racing due to the ramdisk back-end, but maybe we need to ramp ours up to run more in parallel in a loop.

I will run 21 parallel runs and see if it makes a difference tonight and report back tomorrow.
Clearly, prior to your final patches errors were escaping back to the FS layer, but since your patches that is resolved, at least in our test harness.

Thanks
Laurence
Laurence Oberman Aug. 3, 2016, 2:55 a.m. UTC | #8
----- Original Message -----
> From: "Laurence Oberman" <loberman@redhat.com>
> To: "Mike Snitzer" <snitzer@redhat.com>
> Cc: "Bart Van Assche" <bart.vanassche@sandisk.com>, dm-devel@redhat.com, linux-scsi@vger.kernel.org
> Sent: Tuesday, August 2, 2016 10:18:30 PM
> Subject: Re: dm-mq and end_clone_request()
> 

Hello

I ran 20 parallel runs with 3 loops through host deletion, and in each case fio survived with no hard error escaping to the FS layer.
It's solid in our test bed.
Keep in mind we have no ib_srpt loaded, as we have a hardware-based array and are connected directly to it with EDR 100.
I am also not removing and reloading modules as happens in Bart's scripts, and not trying to delete mpath maps, etc.

I focused only on the I/O error that was escaping up to the FS layer.
I will check in with Bart tomorrow.

Thanks
Laurence
Laurence Oberman Aug. 3, 2016, 3:10 p.m. UTC | #9
----- Original Message -----
> From: "Laurence Oberman" <loberman@redhat.com>
> To: "Mike Snitzer" <snitzer@redhat.com>
> Cc: "Bart Van Assche" <bart.vanassche@sandisk.com>, dm-devel@redhat.com, linux-scsi@vger.kernel.org
> Sent: Tuesday, August 2, 2016 10:55:59 PM
> Subject: Re: dm-mq and end_clone_request()
> 

Hi Bart

Looking back at your email.

I also get these, but those are expected, as we are in the process of doing I/O when we yank the hosts and the in-flight requests are affected.

Aug  2 22:41:23 jumpclient kernel: device-mapper: multipath: Failing path 8:192.
Aug  2 22:41:23 jumpclient kernel: blk_update_request: I/O error, dev sdm, sector 258504
Aug  2 22:41:23 jumpclient kernel: blk_update_request: I/O error, dev sdm, sector 60320

However, with the patches applied, I no longer get any of the errors you show:
[  162.903284] Buffer I/O error on dev dm-0, logical block 32928, lost sync page write

I will work with you to understand why, with Mike's patches, it's now stable here but not in your configuration.

Thanks
Laurence
Bart Van Assche Aug. 3, 2016, 4:06 p.m. UTC | #10
On 08/02/2016 06:33 PM, Laurence Oberman wrote:
> #!/bin/bash
> for i in /sys/class/srp_remote_ports/*
> do
>  echo "Deleting host $i, it will re-connect via srp_daemon"
>  echo 1 > $i/delete
>  sleep 10
> done

Hello Laurence,

Sorry but the above looks wrong to me. There should be a second loop 
around this loop and the sleep statement should be moved from the inner 
loop to the outer loop. The above code logs out one (initiator, target) 
port pair at a time instead of logging out all paths at once.

Bart.
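
Concretely, the restructuring described above might look like this (a
sketch; the iteration count and sleep time are arbitrary placeholders):

#!/bin/bash
# Log out all (initiator, target) port pairs at once, then wait
# before repeating the whole cycle.
for iteration in 1 2 3
do
 for i in /sys/class/srp_remote_ports/*
 do
  echo "Deleting host $i, it will re-connect via srp_daemon"
  echo 1 > $i/delete &
 done
 wait
 sleep 10
done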
Bart Van Assche Aug. 3, 2016, 4:55 p.m. UTC | #11
On 08/02/2016 05:40 PM, Mike Snitzer wrote:
> But I asked you to run the v4.7 kernel patches I
> pointed to _without_ any of your debug patches.

I need several patches to fix bugs that are not related to the device 
mapper, e.g. "sched: Avoid that __wait_on_bit_lock() hangs" 
(https://lkml.org/lkml/2016/8/3/289).

Bart.
Laurence Oberman Aug. 3, 2016, 5:25 p.m. UTC | #12
----- Original Message -----
> From: "Bart Van Assche" <bart.vanassche@sandisk.com>
> To: "Laurence Oberman" <loberman@redhat.com>, "Mike Snitzer" <snitzer@redhat.com>
> Cc: dm-devel@redhat.com, linux-scsi@vger.kernel.org
> Sent: Wednesday, August 3, 2016 12:06:17 PM
> Subject: Re: [dm-devel] dm-mq and end_clone_request()
> 
> On 08/02/2016 06:33 PM, Laurence Oberman wrote:
> > #!/bin/bash
> > for i in /sys/class/srp_remote_ports/*
> > do
> >  echo "Deleting host $i, it will re-connect via srp_daemon"
> >  echo 1 > $i/delete
> >  sleep 10
> > done
> 
> Hello Laurence,
> 
> Sorry but the above looks wrong to me. There should be a second loop
> around this loop and the sleep statement should be moved from the inner
> loop to the outer loop. The above code logs out one (initiator, target)
> port pair at a time instead of logging out all paths at once.
> 
> Bart.
> 

Hi Bart

It logs out each host in turn with a 10s sleep in between.
I actually reduced the sleep to 3s last night.
We do end up with all paths lost, but not at precisely the same second.

Are you saying we have to lose all paths at the same time?
That is easy to fix, and I was running it that way in the beginning; I will re-test.

Thanks
Laurence
Laurence Oberman Aug. 3, 2016, 6:03 p.m. UTC | #13
----- Original Message -----
> From: "Bart Van Assche" <bart.vanassche@sandisk.com>
> To: "Laurence Oberman" <loberman@redhat.com>, "Mike Snitzer" <snitzer@redhat.com>
> Cc: dm-devel@redhat.com, linux-scsi@vger.kernel.org
> Sent: Wednesday, August 3, 2016 12:06:17 PM
> Subject: Re: [dm-devel] dm-mq and end_clone_request()
> 

Hi Bart

Latest tests are still good on our side.
I am now taking both paths out at the same time, but we still seem stable here.
The first test removed the sleep and we still had a delay; the second test added a background (&) so the deletions ran as close to the same time as possible.
Both tests passed.

I will email the messages log just to you.

With no sleep we still have a gap of 9s when we delete the paths, and we are good.

Aug  3 13:41:21 jumpclient multipathd: 360001ff0b035d000000000008d700001: remaining active paths: 1
Aug  3 13:41:22 jumpclient multipathd: 360001ff0b035d000000000028d720003: remaining active paths: 1
Aug  3 13:41:22 jumpclient multipathd: 360001ff0b035d000000000048d740005: remaining active paths: 1
Aug  3 13:41:22 jumpclient multipathd: 360001ff0b035d000000000068d760007: remaining active paths: 1
Aug  3 13:41:23 jumpclient multipathd: 360001ff0b035d0000000000b8d7b000c: remaining active paths: 1
Aug  3 13:41:23 jumpclient multipathd: 360001ff0b035d0000000000d8d7d000e: remaining active paths: 1
Aug  3 13:41:23 jumpclient multipathd: 360001ff0b035d000000000118d810012: remaining active paths: 1
Aug  3 13:41:24 jumpclient multipathd: 360001ff0b035d000000000138d830014: remaining active paths: 1
Aug  3 13:41:24 jumpclient multipathd: 360001ff0b035d000000000158d850016: remaining active paths: 1
Aug  3 13:41:25 jumpclient multipathd: 360001ff0b035d000000000178d870018: remaining active paths: 1
Aug  3 13:41:25 jumpclient multipathd: 360001ff0b035d000000000198d89001a: remaining active paths: 1
Aug  3 13:41:25 jumpclient multipathd: 360001ff0b035d0000000001a8d8a001b: remaining active paths: 1
Aug  3 13:41:25 jumpclient multipathd: 360001ff0b035d0000000001c8d8c001d: remaining active paths: 1
Aug  3 13:41:26 jumpclient multipathd: 360001ff0b035d0000000001e8d8e001f: remaining active paths: 1
Aug  3 13:41:26 jumpclient multipathd: 360001ff0b035d0000000001f8d8f0020: remaining active paths: 1
Aug  3 13:41:26 jumpclient multipathd: 360001ff0b035d000000000208d900021: remaining active paths: 1
Aug  3 13:41:26 jumpclient multipathd: 360001ff0b035d000000000228d920023: remaining active paths: 1
Aug  3 13:41:28 jumpclient multipathd: 360001ff0b035d000000000248d940025: remaining active paths: 1
Aug  3 13:41:29 jumpclient multipathd: 360001ff0b035d000000000268d960027: remaining active paths: 1
Aug  3 13:41:29 jumpclient multipathd: 360001ff0b035d000000000278d970028: remaining active paths: 1
Aug  3 13:41:30 jumpclient multipathd: 360001ff0b035d000000000288d980029: remaining active paths: 1
Aug  3 13:41:35 jumpclient multipathd: 360001ff0b035d000000000008d700001: remaining active paths: 0
Aug  3 13:41:36 jumpclient multipathd: 360001ff0b035d000000000028d720003: remaining active paths: 0
Aug  3 13:41:37 jumpclient multipathd: 360001ff0b035d000000000048d740005: remaining active paths: 0
Aug  3 13:41:37 jumpclient multipathd: 360001ff0b035d000000000068d760007: remaining active paths: 0
Aug  3 13:41:38 jumpclient multipathd: 360001ff0b035d0000000000b8d7b000c: remaining active paths: 0
Aug  3 13:41:38 jumpclient multipathd: 360001ff0b035d0000000000d8d7d000e: remaining active paths: 0
Aug  3 13:41:38 jumpclient multipathd: 360001ff0b035d000000000108d800011: remaining active paths: 0
Aug  3 13:41:38 jumpclient multipathd: 360001ff0b035d000000000118d810012: remaining active paths: 0
Aug  3 13:41:38 jumpclient multipathd: 360001ff0b035d000000000138d830014: remaining active paths: 0
Aug  3 13:41:39 jumpclient multipathd: 360001ff0b035d000000000158d850016: remaining active paths: 0
Aug  3 13:41:39 jumpclient multipathd: 360001ff0b035d000000000178d870018: remaining active paths: 0
Aug  3 13:41:39 jumpclient multipathd: 360001ff0b035d000000000198d89001a: remaining active paths: 0
Aug  3 13:41:39 jumpclient multipathd: 360001ff0b035d0000000001a8d8a001b: remaining active paths: 0
Aug  3 13:41:39 jumpclient multipathd: 360001ff0b035d0000000001c8d8c001d: remaining active paths: 0
Aug  3 13:41:39 jumpclient multipathd: 360001ff0b035d0000000001e8d8e001f: remaining active paths: 0
Aug  3 13:41:41 jumpclient multipathd: 360001ff0b035d0000000001f8d8f0020: remaining active paths: 0
Aug  3 13:41:41 jumpclient multipathd: 360001ff0b035d000000000208d900021: remaining active paths: 0
Aug  3 13:41:43 jumpclient multipathd: 360001ff0b035d000000000248d940025: remaining active paths: 0
Aug  3 13:41:43 jumpclient multipathd: 360001ff0b035d000000000268d960027: remaining active paths: 0
Aug  3 13:41:44 jumpclient multipathd: 360001ff0b035d000000000288d980029: remaining active paths: 0
Aug  3 13:42:44 jumpclient multipathd: 360001ff0b035d000000000138d830014: remaining active paths: 2
Aug  3 13:42:44 jumpclient multipathd: 360001ff0b035d000000000158d850016: remaining active paths: 2

These are the only errors and they are expected.

Aug  3 13:41:22 jumpclient kernel: blk_update_request: I/O error, dev sdd, sector 31141264880
Aug  3 13:41:22 jumpclient kernel: blk_update_request: I/O error, dev sdd, sector 79928
Aug  3 13:41:22 jumpclient kernel: blk_update_request: I/O error, dev sdd, sector 65264
Aug  3 13:41:22 jumpclient kernel: blk_update_request: I/O error, dev sdd, sector 55232
Aug  3 13:41:22 jumpclient kernel: blk_update_request: I/O error, dev sdd, sector 14152
Aug  3 13:41:22 jumpclient kernel: blk_update_request: I/O error, dev sdd, sector 168880
Aug  3 13:41:22 jumpclient kernel: blk_update_request: I/O error, dev sdd, sector 269392
Aug  3 13:41:22 jumpclient kernel: blk_update_request: I/O error, dev sdd, sector 309200
Aug  3 13:41:22 jumpclient kernel: blk_update_request: I/O error, dev sdd, sector 87520
Aug  3 13:41:22 jumpclient kernel: blk_update_request: I/O error, dev sdd, sector 7744
Aug  3 13:41:28 jumpclient kernel: blk_update_request: I/O error, dev sdca, sector 119984
Aug  3 13:41:29 jumpclient kernel: blk_update_request: I/O error, dev sdcd, sector 31139908984
Aug  3 13:41:29 jumpclient kernel: blk_update_request: I/O error, dev sdcd, sector 131136
Aug  3 13:41:29 jumpclient kernel: blk_update_request: I/O error, dev sdcd, sector 97536
Aug  3 13:41:29 jumpclient kernel: blk_update_request: I/O error, dev sdcd, sector 123264
Aug  3 13:41:29 jumpclient kernel: blk_update_request: I/O error, dev sdcd, sector 110336
Aug  3 13:41:29 jumpclient kernel: blk_update_request: I/O error, dev sdcd, sector 158136
Aug  3 13:41:29 jumpclient kernel: blk_update_request: I/O error, dev sdcd, sector 156136
Aug  3 13:41:29 jumpclient kernel: blk_update_request: I/O error, dev sdcd, sector 173072
Aug  3 13:41:29 jumpclient kernel: blk_update_request: I/O error, dev sdcd, sector 6984
Aug  3 13:41:35 jumpclient kernel: blk_update_request: I/O error, dev sdc, sector 130224
Aug  3 13:41:35 jumpclient kernel: blk_update_request: I/O error, dev sdc, sector 225816
Aug  3 13:41:36 jumpclient kernel: blk_update_request: I/O error, dev sdc, sector 248120
Aug  3 13:41:36 jumpclient kernel: blk_update_request: I/O error, dev sdc, sector 242528
Aug  3 13:41:36 jumpclient kernel: blk_update_request: I/O error, dev sdk, sector 251248
Aug  3 13:41:36 jumpclient kernel: blk_update_request: I/O error, dev sdk, sector 242032
Aug  3 13:41:36 jumpclient kernel: blk_update_request: I/O error, dev sdk, sector 203736
Aug  3 13:41:36 jumpclient kernel: blk_update_request: I/O error, dev sdk, sector 31141107808
Aug  3 13:41:36 jumpclient kernel: blk_update_request: I/O error, dev sdk, sector 233336
Aug  3 13:41:36 jumpclient kernel: blk_update_request: I/O error, dev sdk, sector 187944
Aug  3 13:41:41 jumpclient kernel: blk_update_request: I/O error, dev sdbl, sector 85800
Aug  3 13:41:41 jumpclient kernel: blk_update_request: I/O error, dev sdbl, sector 74120
Aug  3 13:41:41 jumpclient kernel: blk_update_request: I/O error, dev sdbl, sector 78216
Aug  3 13:41:41 jumpclient kernel: blk_update_request: I/O error, dev sdbl, sector 79976
Aug  3 13:41:41 jumpclient kernel: blk_update_request: I/O error, dev sdbl, sector 79552
Aug  3 13:41:41 jumpclient kernel: blk_update_request: I/O error, dev sdbl, sector 87888
Aug  3 13:41:43 jumpclient kernel: blk_update_request: I/O error, dev sdbt, sector 274368
Aug  3 13:41:43 jumpclient kernel: blk_update_request: I/O error, dev sdbt, sector 31139814080
Aug  3 13:41:43 jumpclient kernel: blk_update_request: I/O error, dev sdbx, sector 6776
Aug  3 13:41:43 jumpclient kernel: blk_update_request: I/O error, dev sdbx, sector 302152

Changing the script to add & takes both paths away at the same time, but we still seem to survive here.

This is my configuration:

360001ff0b035d000000000078d770008 dm-9 DDN     ,SFA14K          
size=29T features='3 queue_if_no_path queue_mode mq' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=90 status=active
| `- 1:0:0:7  sday 67:32  active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  `- 2:0:0:7  sdj  8:144  active ready running

       device {
        vendor                  "DDN"
        product                 "SFA14K"
        path_grouping_policy    group_by_prio
        prio                    alua
        path_selector           "round-robin 0"
        path_checker            tur
        failback                2
        rr_weight               uniform
        no_path_retry           12
        dev_loss_tmo            10
        fast_io_fail_tmo        5
	features     "1 queue_if_no_path"
    }


With &

#!/bin/bash
for i in /sys/class/srp_remote_ports/*
do
 echo "Deleting host $i, it will re-connect via srp_daemon" 
 echo 1 > $i/delete &
 #sleep 3
done

Here we lose both paths at the same time.

[root@jumpclient bart_tests]# grep remaining messages
Aug  3 13:49:38 jumpclient multipathd: 360001ff0b035d000000000008d700001: remaining active paths: 0
Aug  3 13:49:38 jumpclient multipathd: 360001ff0b035d000000000028d720003: remaining active paths: 0
Aug  3 13:49:38 jumpclient multipathd: 360001ff0b035d000000000048d740005: remaining active paths: 0
Aug  3 13:49:41 jumpclient multipathd: 360001ff0b035d000000000068d760007: remaining active paths: 0
Aug  3 13:49:42 jumpclient multipathd: 360001ff0b035d0000000000d8d7d000e: remaining active paths: 0
Aug  3 13:49:45 jumpclient multipathd: 360001ff0b035d000000000118d810012: remaining active paths: 0
Aug  3 13:49:45 jumpclient multipathd: 360001ff0b035d000000000108d800011: remaining active paths: 0
Aug  3 13:49:47 jumpclient multipathd: 360001ff0b035d000000000158d850016: remaining active paths: 0
Aug  3 13:49:48 jumpclient multipathd: 360001ff0b035d000000000178d870018: remaining active paths: 0
Aug  3 13:49:48 jumpclient multipathd: 360001ff0b035d000000000198d89001a: remaining active paths: 0
Aug  3 13:49:48 jumpclient multipathd: 360001ff0b035d0000000001a8d8a001b: remaining active paths: 0
Aug  3 13:49:55 jumpclient multipathd: 360001ff0b035d0000000001e8d8e001f: remaining active paths: 0
Aug  3 13:49:55 jumpclient multipathd: 360001ff0b035d0000000001f8d8f0020: remaining active paths: 0
Aug  3 13:49:58 jumpclient multipathd: 360001ff0b035d000000000248d940025: remaining active paths: 0
Aug  3 13:49:59 jumpclient multipathd: 360001ff0b035d000000000268d960027: remaining active paths: 0
Aug  3 13:50:00 jumpclient multipathd: 360001ff0b035d000000000288d980029: remaining active paths: 0
Aug  3 13:51:17 jumpclient multipathd: 360001ff0b035d000000000038d730004: remaining active paths: 2
Aug  3 13:51:17 jumpclient multipathd: 360001ff0b035d000000000028d720003: remaining active paths: 2
Aug  3 13:51:19 jumpclient multipathd: 360001ff0b035d000000000078d770008: remaining active paths: 2
Aug  3 13:51:20 jumpclient multipathd: 360001ff0b035d0000000000a8d7a000b: remaining active paths: 2
Aug  3 13:51:23 jumpclient multipathd: 360001ff0b035d0000000000d8d7d000e: remaining active paths: 2
Aug  3 13:51:24 jumpclient multipathd: 360001ff0b035d000000000108d800011: remaining active paths: 2
Aug  3 13:51:25 jumpclient multipathd: 360001ff0b035d0000000000f8d7f0010: remaining active paths: 2
Aug  3 13:51:26 jumpclient multipathd: 360001ff0b035d000000000128d820013: remaining active paths: 2
Aug  3 13:51:29 jumpclient multipathd: 360001ff0b035d0000000001c8d8c001d: remaining active paths: 2
Aug  3 13:51:33 jumpclient multipathd: 360001ff0b035d000000000228d920023: remaining active paths: 2
Aug  3 13:51:34 jumpclient multipathd: 360001ff0b035d000000000238d930024: remaining active paths: 2

We still survive.

[root@jumpclient bart_tests]# grep -i error messages
Aug  3 13:49:38 jumpclient kernel: blk_update_request: I/O error, dev sdc, sector 98288
Aug  3 13:49:38 jumpclient kernel: blk_update_request: I/O error, dev sdc, sector 98320
Aug  3 13:49:38 jumpclient kernel: blk_update_request: I/O error, dev sdc, sector 46976
Aug  3 13:49:38 jumpclient kernel: blk_update_request: I/O error, dev sde, sector 216720
Aug  3 13:49:38 jumpclient kernel: blk_update_request: I/O error, dev sdg, sector 130672
Aug  3 13:49:41 jumpclient kernel: blk_update_request: I/O error, dev sdi, sector 56984
Aug  3 13:49:41 jumpclient kernel: blk_update_request: I/O error, dev sdi, sector 56120
Aug  3 13:49:41 jumpclient kernel: blk_update_request: I/O error, dev sdi, sector 62112
Aug  3 13:49:42 jumpclient kernel: blk_update_request: I/O error, dev sdp, sector 156944
Aug  3 13:49:42 jumpclient kernel: blk_update_request: I/O error, dev sdp, sector 31140975496
Aug  3 13:49:44 jumpclient kernel: blk_update_request: I/O error, dev sdt, sector 207392
Aug  3 13:49:44 jumpclient kernel: blk_update_request: I/O error, dev sdt, sector 200568
Aug  3 13:49:44 jumpclient kernel: blk_update_request: I/O error, dev sdt, sector 251048
Aug  3 13:49:44 jumpclient kernel: blk_update_request: I/O error, dev sdt, sector 247616
Aug  3 13:49:44 jumpclient kernel: blk_update_request: I/O error, dev sdt, sector 210592
Aug  3 13:49:44 jumpclient kernel: blk_update_request: I/O error, dev sdt, sector 200120
Aug  3 13:49:44 jumpclient kernel: blk_update_request: I/O error, dev sdt, sector 203000
Aug  3 13:49:44 jumpclient kernel: blk_update_request: I/O error, dev sdt, sector 248640
Aug  3 13:49:47 jumpclient kernel: blk_update_request: I/O error, dev sdx, sector 48232
Aug  3 13:49:48 jumpclient kernel: blk_update_request: I/O error, dev sdz, sector 9984
Aug  3 13:49:55 jumpclient kernel: blk_update_request: I/O error, dev sdag, sector 130512
Aug  3 13:49:58 jumpclient kernel: blk_update_request: I/O error, dev sdai, sector 39040
Aug  3 13:49:58 jumpclient kernel: blk_update_request: I/O error, dev sdam, sector 31140570528
Aug  3 13:49:59 jumpclient kernel: blk_update_request: I/O error, dev sdao, sector 204552
Aug  3 13:50:00 jumpclient kernel: blk_update_request: I/O error, dev sdaq, sector 31142052904



Mike Snitzer Aug. 4, 2016, 4:10 p.m. UTC | #14
On Wed, Aug 03 2016 at 12:55pm -0400,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> On 08/02/2016 05:40 PM, Mike Snitzer wrote:
> >But I asked you to run the v4.7 kernel patches I
> >pointed to _without_ any of your debug patches.
> 
> I need several patches to fix bugs that are not related to the
> device mapper, e.g. "sched: Avoid that __wait_on_bit_lock() hangs"
> (https://lkml.org/lkml/2016/8/3/289).

OK, but you have way more changes than seem needed.  In particular the
blk-mq error handling changes look suspect.

I'm also not sure what REQ_FAIL_IF_NO_PATH is all about (I vaguely recall
seeing it before, and suggesting you use SCSI's more traditional
differentiated IO errors).

Anyway, at this point you're having us test too many changes that aren't
yet upstream:

$ git diff bart/srp-initiator-for-next dm/dm-4.7-mpath-fixes -- drivers block include kernel | diffstat
 block/bio-integrity.c                   |    1
 block/blk-cgroup.c                      |    4
 block/blk-core.c                        |   16 ---
 block/blk-mq.c                          |   16 ---
 block/partition-generic.c               |    3
 drivers/acpi/acpica/nswalk.c            |    1
 drivers/infiniband/core/rw.c            |   24 +++--
 drivers/infiniband/core/verbs.c         |    9 --
 drivers/infiniband/hw/hfi1/Kconfig      |    1
 drivers/infiniband/hw/mlx4/qp.c         |    6 -
 drivers/infiniband/sw/rdmavt/Kconfig    |    1
 drivers/infiniband/ulp/isert/ib_isert.c |    2
 drivers/infiniband/ulp/isert/ib_isert.h |    1
 drivers/infiniband/ulp/srp/ib_srp.c     |  131 --------------------------------
 drivers/infiniband/ulp/srp/ib_srp.h     |    5 -
 drivers/infiniband/ulp/srpt/ib_srpt.c   |   10 +-
 drivers/infiniband/ulp/srpt/ib_srpt.h   |    6 -
 drivers/md/dm-crypt.c                   |    4
 drivers/md/dm-ioctl.c                   |   77 +++++++++---------
 drivers/md/dm-mpath.c                   |   32 -------
 drivers/md/dm.c                         |   22 -----
 drivers/scsi/scsi_lib.c                 |   36 +-------
 drivers/scsi/scsi_priv.h                |    2
 drivers/scsi/scsi_scan.c                |    2
 drivers/scsi/scsi_sysfs.c               |   48 -----------
 drivers/scsi/sd.c                       |    6 -
 drivers/scsi/sg.c                       |    3
 include/linux/blk-mq.h                  |    3
 include/linux/blk_types.h               |    5 -
 include/linux/blkdev.h                  |    1
 include/linux/dmar.h                    |    2
 include/rdma/ib_verbs.h                 |    6 -
 include/scsi/scsi_device.h              |    2
 kernel/sched/wait.c                     |    2
 34 files changed, 106 insertions(+), 384 deletions(-)
Bart Van Assche Aug. 4, 2016, 5:42 p.m. UTC | #15
On 08/04/2016 09:10 AM, Mike Snitzer wrote:
> Anyway, at this point you're having us test too many changes that aren't
> yet upstream:
>
> $ git diff bart/srp-initiator-for-next dm/dm-4.7-mpath-fixes -- drivers block include kernel | diffstat
>  block/bio-integrity.c                   |    1
>  block/blk-cgroup.c                      |    4
>  block/blk-core.c                        |   16 ---
>  block/blk-mq.c                          |   16 ---
>  block/partition-generic.c               |    3
>  drivers/acpi/acpica/nswalk.c            |    1
>  drivers/infiniband/core/rw.c            |   24 +++--
>  drivers/infiniband/core/verbs.c         |    9 --
>  drivers/infiniband/hw/hfi1/Kconfig      |    1
>  drivers/infiniband/hw/mlx4/qp.c         |    6 -
>  drivers/infiniband/sw/rdmavt/Kconfig    |    1
>  drivers/infiniband/ulp/isert/ib_isert.c |    2
>  drivers/infiniband/ulp/isert/ib_isert.h |    1
>  drivers/infiniband/ulp/srp/ib_srp.c     |  131 --------------------------------
>  drivers/infiniband/ulp/srp/ib_srp.h     |    5 -
>  drivers/infiniband/ulp/srpt/ib_srpt.c   |   10 +-
>  drivers/infiniband/ulp/srpt/ib_srpt.h   |    6 -
>  drivers/md/dm-crypt.c                   |    4
>  drivers/md/dm-ioctl.c                   |   77 +++++++++---------
>  drivers/md/dm-mpath.c                   |   32 -------
>  drivers/md/dm.c                         |   22 -----
>  drivers/scsi/scsi_lib.c                 |   36 +-------
>  drivers/scsi/scsi_priv.h                |    2
>  drivers/scsi/scsi_scan.c                |    2
>  drivers/scsi/scsi_sysfs.c               |   48 -----------
>  drivers/scsi/sd.c                       |    6 -
>  drivers/scsi/sg.c                       |    3
>  include/linux/blk-mq.h                  |    3
>  include/linux/blk_types.h               |    5 -
>  include/linux/blkdev.h                  |    1
>  include/linux/dmar.h                    |    2
>  include/rdma/ib_verbs.h                 |    6 -
>  include/scsi/scsi_device.h              |    2
>  kernel/sched/wait.c                     |    2
>  34 files changed, 106 insertions(+), 384 deletions(-)

Hello Mike,

Most of the changes you are referring to either are already upstream,
are expected to arrive in Linus' tree later this week, or only add
debugging pr_info() statements. The changes that are already upstream
or expected to be upstream soon are:

$ for b in origin/master dledford-rdma/k.o/for-4.8-1 \
    dledford-rdma/k.o/for-4.8-2; do \
    git log v4.7..$b --author="Bart Van Assche" | \
    grep ^commit -A4 | sed -n 's/^    //p'; done
block: Fix spelling in a source code comment
dm ioctl: Simplify parameter buffer management code
dm crypt: Fix sparse complaints
block/blk-cgroup.c: Declare local symbols static
block/bio-integrity.c: Add #include "blk.h"
block/partition-generic.c: Remove a set-but-not-used variable
IB/hfi1: Disable by default
IB/rdmavt: Disable by default
IB/isert: Remove an unused member variable
IB/srpt: Simplify srpt_queue_response()
IB/srpt: Limit the number of SG elements per work request
IB/core, RDMA RW API: Do not exceed QP SGE send limit
IB/core: Make rdma_rw_ctx_init() initialize all used fields

Bart.
Mike Snitzer Aug. 4, 2016, 11:58 p.m. UTC | #16
I've staged another fix; Laurence is seeing success with it added:
https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.8&id=d50a6450104c237db1dc75314d17b78c990a8c05

I'll be sending all the fixes I've queued to Linus tonight or early
tomorrow (since I'll then be on vacation until Monday 8/15).
Laurence Oberman Aug. 5, 2016, 1:07 a.m. UTC | #17
----- Original Message -----
> From: "Mike Snitzer" <snitzer@redhat.com>
> To: "Bart Van Assche" <bart.vanassche@sandisk.com>
> Cc: dm-devel@redhat.com, "Laurence Oberman" <loberman@redhat.com>, linux-scsi@vger.kernel.org
> Sent: Thursday, August 4, 2016 7:58:50 PM
> Subject: Re: dm-mq and end_clone_request()
> 
> I've staged another fix; Laurence is seeing success with this added:
> https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.8&id=d50a6450104c237db1dc75314d17b78c990a8c05
> 
> I'll be sending all the fixes I've queued to Linus tonight or early
> tomorrow (since I'll then be on vacation until Monday 8/15).
> 
Hello Bart,

I applied that patch to your kernel and while I still obviously see all the debug logging, it's no longer failing fio for me.
I ran 8 loops with 20 parallel fio runs. This was on a different server from the one I had been testing on.
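
A sketch of the general shape of such a run (the actual fio job file was not posted in this thread, so every parameter below is illustrative only):

$ for i in $(seq 8); do
      fio --name=mpath-test --filename=/dev/mapper/mpathbe --rw=randrw \
          --ioengine=libaio --iodepth=32 --direct=1 --numjobs=20 \
          --runtime=60 --time_based --group_reporting
  done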

However, I am concerned about timing playing a part here, so let us know what you find.

Thanks
Laurence
Laurence Oberman Aug. 5, 2016, 11:43 a.m. UTC | #18
----- Original Message -----
> From: "Laurence Oberman" <loberman@redhat.com>
> To: "Mike Snitzer" <snitzer@redhat.com>
> Cc: "Bart Van Assche" <bart.vanassche@sandisk.com>, dm-devel@redhat.com, linux-scsi@vger.kernel.org
> Sent: Thursday, August 4, 2016 9:07:28 PM
> Subject: Re: dm-mq and end_clone_request()
> 
> 
> 
> ----- Original Message -----
> > From: "Mike Snitzer" <snitzer@redhat.com>
> > To: "Bart Van Assche" <bart.vanassche@sandisk.com>
> > Cc: dm-devel@redhat.com, "Laurence Oberman" <loberman@redhat.com>,
> > linux-scsi@vger.kernel.org
> > Sent: Thursday, August 4, 2016 7:58:50 PM
> > Subject: Re: dm-mq and end_clone_request()
> > 
> > I've staged another fix; Laurence is seeing success with this added:
> > https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.8&id=d50a6450104c237db1dc75314d17b78c990a8c05
> > 
> > I'll be sending all the fixes I've queued to Linus tonight or early
> > tomorrow (since I'll then be on vacation until Monday 8/15).
> > 
> Hello Bart,
> 
> I applied that patch to your kernel and while I still obviously see all the
> debug logging, it's no longer failing fio for me.
> I ran 8 loops with 20 parallel fio runs. This was on a different server from
> the one I had been testing on.
> 
> However, I am concerned about timing playing a part here, so let us know
> what you find.
> 
> Thanks
> Laurence
Replying to my own message:

Hi Bart, Mike

Further testing has shown we are still exposed here, so more investigation is necessary.
The above patch seems to help, but I still see sporadic cases of errors escaping up the stack.

I expect you will see the same, so there is more work to do here to figure this out.

Thanks
Laurence
Laurence Oberman Aug. 5, 2016, 3:39 p.m. UTC | #19
----- Original Message -----
> From: "Laurence Oberman" <loberman@redhat.com>
> To: "Mike Snitzer" <snitzer@redhat.com>
> Cc: "Bart Van Assche" <bart.vanassche@sandisk.com>, dm-devel@redhat.com, linux-scsi@vger.kernel.org
> Sent: Friday, August 5, 2016 7:43:30 AM
> Subject: Re: dm-mq and end_clone_request()
> 
> 
> 
> ----- Original Message -----
> > From: "Laurence Oberman" <loberman@redhat.com>
> > To: "Mike Snitzer" <snitzer@redhat.com>
> > Cc: "Bart Van Assche" <bart.vanassche@sandisk.com>, dm-devel@redhat.com,
> > linux-scsi@vger.kernel.org
> > Sent: Thursday, August 4, 2016 9:07:28 PM
> > Subject: Re: dm-mq and end_clone_request()
> > 
> > 
> > 
> > ----- Original Message -----
> > > From: "Mike Snitzer" <snitzer@redhat.com>
> > > To: "Bart Van Assche" <bart.vanassche@sandisk.com>
> > > Cc: dm-devel@redhat.com, "Laurence Oberman" <loberman@redhat.com>,
> > > linux-scsi@vger.kernel.org
> > > Sent: Thursday, August 4, 2016 7:58:50 PM
> > > Subject: Re: dm-mq and end_clone_request()
> > > 
> > > I've staged another fix; Laurence is seeing success with this added:
> > > https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.8&id=d50a6450104c237db1dc75314d17b78c990a8c05
> > > 
> > > I'll be sending all the fixes I've queued to Linus tonight or early
> > > tomorrow (since I'll then be on vacation until Monday 8/15).
> > > 
> > Hello Bart,
> > 
> > I applied that patch to your kernel and while I still obviously see all the
> > debug logging, it's no longer failing fio for me.
> > I ran 8 loops with 20 parallel fio runs. This was on a different server from
> > the one I had been testing on.
> > 
> > However, I am concerned about timing playing a part here, so let us know
> > what you find.
> > 
> > Thanks
> > Laurence
> Replying to my own message:
> 
> Hi Bart, Mike
> 
> Further testing has shown we are still exposed here, so more investigation
> is necessary.
> The above patch seems to help, but I still see sporadic cases of errors
> escaping up the stack.
> 
> I expect you will see the same, so there is more work to do here to figure
> this out.
> 
> Thanks
> Laurence
> 
Hello Bart,

I completely forgot I had set no_path_retry=12, so after 12 retries it will error out.
This is likely why I had different results seemingly affected by timing.
Mike reminded me of it this morning.

What do you have set for no_path_retry? When I set it to queue, it blocks the paths from coming back for some reason.
I am now investigating why that is happening :).
I see now I need to add "simultaneous all paths lost" scenarios to my QA testing, as it's not a common scenario.
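
For reference, a minimal sketch of the two settings being compared here (multipath.conf syntax; the values are just the ones mentioned in this thread, not recommendations):

defaults {
	# fail I/O after 12 checker intervals with no usable path
	no_path_retry	12
}

# versus queueing indefinitely until a path returns:
defaults {
	no_path_retry	queue
}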

Thanks
Laurence
Bart Van Assche Aug. 5, 2016, 3:43 p.m. UTC | #20
On 08/05/2016 08:39 AM, Laurence Oberman wrote:
> I completely forgot I had set no_path_retry=12, so after 12 retries it will error out.
> This is likely why I had different results seemingly affected by timing.
> Mike reminded me of it this morning.
> 
> What do you have set for no_path_retry? When I set it to queue, it blocks the paths from coming back for some reason.
> I am now investigating why that is happening :).
> I see now I need to add "simultaneous all paths lost" scenarios to my QA testing, as it's not a common scenario.

Hello Laurence,

I'm using the following multipath.conf file for the tests I run:

defaults {
	user_friendly_names	yes
	queue_without_daemon	no
}

blacklist {
	device {
		vendor			"ATA"
		product			".*"
	}
}

devices {
	device {
		vendor			"SCST_BIO|LIO-ORG"
		product			".*"
		features		"3 queue_if_no_path pg_init_retries 50"
		path_grouping_policy	group_by_prio
		path_selector		"queue-length 0"
		path_checker		tur
	}
}

blacklist_exceptions {
        property        ".*"
}
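
For reference, one quick way to confirm whether queue_if_no_path actually took effect on a map (mpathbe is just the example device name from earlier in the thread):

$ multipath -ll mpathbe | head -2   # the features line should show queue_if_no_path
$ dmsetup table mpathbe             # kernel-side view of the same features field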

Bart.

Bart Van Assche Aug. 5, 2016, 6:40 p.m. UTC | #21
On 08/04/2016 04:58 PM, Mike Snitzer wrote:
> I've staged another fix; Laurence is seeing success with this added:
> https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.8&id=d50a6450104c237db1dc75314d17b78c990a8c05

Thanks Mike. I have started testing that fix this morning.

Bart.
Bart Van Assche Aug. 5, 2016, 6:42 p.m. UTC | #22
On 08/05/2016 04:43 AM, Laurence Oberman wrote:
> Further testing has shown we are still exposed here, so more investigation is necessary.
> The above patch seems to help, but I still see sporadic cases of errors escaping up the stack.
>
> I expect you will see the same, so there is more work to do here to figure this out.

Hello Laurence,

Unfortunately I also still see sporadic I/O errors when testing 
all-paths-down with CONFIG_DM_MQ_DEFAULT=n (I have not yet tried to 
retest with CONFIG_DM_MQ_DEFAULT=y).

Bart.
Laurence Oberman Aug. 6, 2016, 2:47 p.m. UTC | #23
----- Original Message -----
> From: "Bart Van Assche" <bart.vanassche@sandisk.com>
> To: "Laurence Oberman" <loberman@redhat.com>, "Mike Snitzer" <snitzer@redhat.com>
> Cc: dm-devel@redhat.com, linux-scsi@vger.kernel.org
> Sent: Friday, August 5, 2016 2:42:49 PM
> Subject: Re: [dm-devel] dm-mq and end_clone_request()
> 
> On 08/05/2016 04:43 AM, Laurence Oberman wrote:
> > Further testing has shown we are still exposed here, so more investigation
> > is necessary.
> > The above patch seems to help, but I still see sporadic cases of errors
> > escaping up the stack.
> >
> > I expect you will see the same, so there is more work to do here to figure this out.
> 
> Hello Laurence,
> 
> Unfortunately I also still see sporadic I/O errors when testing
> all-paths-down with CONFIG_DM_MQ_DEFAULT=n (I have not yet tried to
> retest with CONFIG_DM_MQ_DEFAULT=y).
> 
> Bart.
> 
Hello Bart,

I am still debugging this, now that I have no_path_retry=queue rather than a count :)
I am often hitting the host delete race. Have you seen this in your testing during debugging?

I am using your kernel, built from your git tree with Mike's patches applied:
4.7.0bart

[66813.896159] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
[66813.933246] Workqueue: srp_remove srp_remove_work [ib_srp]
[66813.964703]  0000000000000086 00000000d185b9ce ffff88060fa03d20 ffffffff813456df
[66814.007292]  0000000000000000 0000000000000000 ffff88060fa03d60 ffffffff81089fb1
[66814.049336]  0000007da067604b ffff880c01643d80 0000000000017ec0 ffff880c016447dc
[66814.091725] Call Trace:
[66814.104775]  <IRQ>  [<ffffffff813456df>] dump_stack+0x63/0x84
[66814.136507]  [<ffffffff81089fb1>] __warn+0xd1/0xf0
[66814.163118]  [<ffffffff8108a0ed>] warn_slowpath_null+0x1d/0x20
[66814.195409]  [<ffffffff8104fd7e>] native_smp_send_reschedule+0x3e/0x40
[66814.231954]  [<ffffffff810b47db>] try_to_wake_up+0x30b/0x390
[66814.263661]  [<ffffffff810b4912>] default_wake_function+0x12/0x20
[66814.297713]  [<ffffffff810ccb05>] __wake_up_common+0x55/0x90
[66814.330021]  [<ffffffff810ccb53>] __wake_up_locked+0x13/0x20
[66814.361906]  [<ffffffff81261179>] ep_poll_callback+0xb9/0x200
[66814.392784]  [<ffffffff810ccb05>] __wake_up_common+0x55/0x90
[66814.424908]  [<ffffffff810ccc59>] __wake_up+0x39/0x50
[66814.454327]  [<ffffffff810e1f80>] wake_up_klogd_work_func+0x40/0x60
[66814.490152]  [<ffffffff81177b6d>] irq_work_run_list+0x4d/0x70
[66814.523007]  [<ffffffff810710d0>] ? do_flush_tlb_all+0x50/0x50
[66814.556161]  [<ffffffff81177bbc>] irq_work_run+0x2c/0x30
[66814.586677]  [<ffffffff8110ab5f>] flush_smp_call_function_queue+0x8f/0x160
[66814.625667]  [<ffffffff8110b613>] generic_smp_call_function_single_interrupt+0x13/0x60
[66814.669276]  [<ffffffff81050167>] smp_call_function_interrupt+0x27/0x40
[66814.706255]  [<ffffffff816c7e9c>] call_function_interrupt+0x8c/0xa0
[66814.741406]  <EOI>  [<ffffffff8118e733>] ? panic+0x1ef/0x233
[66814.772851]  [<ffffffff8118e72f>] ? panic+0x1eb/0x233
[66814.800207]  [<ffffffff810308f8>] oops_end+0xb8/0xd0
[66814.827454]  [<ffffffff8106977e>] no_context+0x13e/0x3a0
[66814.858368]  [<ffffffff811f3feb>] ? __slab_free+0x9b/0x280
[66814.890365]  [<ffffffff81069ace>] __bad_area_nosemaphore+0xee/0x1d0
[66814.926508]  [<ffffffff81069bc4>] bad_area_nosemaphore+0x14/0x20
[66814.959939]  [<ffffffff8106a269>] __do_page_fault+0x89/0x4a0
[66814.992039]  [<ffffffff811f3feb>] ? __slab_free+0x9b/0x280
[66815.023052]  [<ffffffff8106a6b0>] do_page_fault+0x30/0x80
[66815.053368]  [<ffffffff816c8b88>] page_fault+0x28/0x30
[66815.083196]  [<ffffffff814ae4e9>] ? __scsi_remove_device+0x79/0x160
[66815.117444]  [<ffffffff814ae5c2>] ? __scsi_remove_device+0x152/0x160
[66815.152051]  [<ffffffff814ac790>] scsi_forget_host+0x60/0x70
[66815.183939]  [<ffffffff814a0137>] scsi_remove_host+0x77/0x110
[66815.216152]  [<ffffffffa0677be0>] srp_remove_work+0x90/0x200 [ib_srp]
[66815.253221]  [<ffffffff810a2e72>] process_one_work+0x152/0x400
[66815.286221]  [<ffffffff810a3765>] worker_thread+0x125/0x4b0
[66815.317313]  [<ffffffff810a3640>] ? rescuer_thread+0x380/0x380
[66815.349770]  [<ffffffff810a9298>] kthread+0xd8/0xf0
[66815.376082]  [<ffffffff816c6b3f>] ret_from_fork+0x1f/0x40
[66815.404767]  [<ffffffff810a91c0>] ? kthread_park+0x60/0x60
[66815.436448] ---[ end trace bfaf79198d0976f5 ]---

Bart Van Assche Aug. 7, 2016, 10:31 p.m. UTC | #24
On 08/06/16 07:47, Laurence Oberman wrote:
> [66813.933246] Workqueue: srp_remove srp_remove_work [ib_srp]
> [ ... ]
> [66815.152051]  [<ffffffff814ac790>] scsi_forget_host+0x60/0x70
> [66815.183939]  [<ffffffff814a0137>] scsi_remove_host+0x77/0x110
> [66815.216152]  [<ffffffffa0677be0>] srp_remove_work+0x90/0x200 [ib_srp]
> [66815.253221]  [<ffffffff810a2e72>] process_one_work+0x152/0x400
> [66815.286221]  [<ffffffff810a3765>] worker_thread+0x125/0x4b0
> [66815.317313]  [<ffffffff810a3640>] ? rescuer_thread+0x380/0x380
> [66815.349770]  [<ffffffff810a9298>] kthread+0xd8/0xf0
> [66815.376082]  [<ffffffff816c6b3f>] ret_from_fork+0x1f/0x40
> [66815.404767]  [<ffffffff810a91c0>] ? kthread_park+0x60/0x60

Hello Laurence,

This is a call stack I have not yet encountered myself during any test. 
Please provide the output of the following commands:
$ gdb /lib/modules/$(uname -r)/build/vmlinux
(gdb) list *(scsi_forget_host+0x60)
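
The same lookup can also be done non-interactively if that is more convenient, e.g.:

$ gdb -batch -ex 'list *(scsi_forget_host+0x60)' \
      /lib/modules/$(uname -r)/build/vmlinux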

Thanks,

Bart.

Laurence Oberman Aug. 8, 2016, 12:45 p.m. UTC | #25
----- Original Message -----
> From: "Bart Van Assche" <bvanassche@acm.org>
> To: "Laurence Oberman" <loberman@redhat.com>
> Cc: "Mike Snitzer" <snitzer@redhat.com>, dm-devel@redhat.com, linux-scsi@vger.kernel.org
> Sent: Sunday, August 7, 2016 6:31:11 PM
> Subject: Re: [dm-devel] dm-mq and end_clone_request()
> 
> On 08/06/16 07:47, Laurence Oberman wrote:
> > [66813.933246] Workqueue: srp_remove srp_remove_work [ib_srp]
> > [ ... ]
> > [66815.152051]  [<ffffffff814ac790>] scsi_forget_host+0x60/0x70
> > [66815.183939]  [<ffffffff814a0137>] scsi_remove_host+0x77/0x110
> > [66815.216152]  [<ffffffffa0677be0>] srp_remove_work+0x90/0x200 [ib_srp]
> > [66815.253221]  [<ffffffff810a2e72>] process_one_work+0x152/0x400
> > [66815.286221]  [<ffffffff810a3765>] worker_thread+0x125/0x4b0
> > [66815.317313]  [<ffffffff810a3640>] ? rescuer_thread+0x380/0x380
> > [66815.349770]  [<ffffffff810a9298>] kthread+0xd8/0xf0
> > [66815.376082]  [<ffffffff816c6b3f>] ret_from_fork+0x1f/0x40
> > [66815.404767]  [<ffffffff810a91c0>] ? kthread_park+0x60/0x60
> 
> Hello Laurence,
> 
> This is a callstack I have not yet encountered myself during any test.
> Please provide the output of the following commands:
> $ gdb /lib/modules/$(uname -r)/build/vmlinux
> (gdb) list *(scsi_forget_host+0x60)
> 
> Thanks,
> 
> Bart.
> 
> 
[loberman@jumptest1 linux]$ gdb vmlinux
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-80.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/loberman/bart/linux/vmlinux...done.
(gdb) list *(scsi_forget_host+0x60)
0xffffffff814ac790 is in scsi_forget_host (drivers/scsi/scsi_scan.c:1895).
1890		list_for_each_entry(sdev, &shost->__devices, siblings) {
1891			if (sdev->sdev_state == SDEV_DEL)
1892				continue;
1893			spin_unlock_irqrestore(shost->host_lock, flags);
1894			__scsi_remove_device(sdev);
1895			goto restart;
1896		}
1897		spin_unlock_irqrestore(shost->host_lock, flags);
1898	}
1899	
diff mbox

Patch

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 1b2f962..0e0f6e0 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2007,6 +2007,9 @@  static int map_request(struct dm_rq_target_io *tio, struct request *rq,
 	struct dm_target *ti = tio->ti;
 	struct request *clone = NULL;
 
+	if (WARN_ON_ONCE(unlikely(dm_suspended_md(md))))
+		return DM_MAPIO_REQUEUE;
+
 	if (tio->clone) {
 		clone = tio->clone;
 		r = ti->type->map_rq(ti, clone, &tio->info);
@@ -2722,6 +2725,9 @@  static int dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
 		dm_put_live_table(md, srcu_idx);
 	}
 
+	if (WARN_ON_ONCE(unlikely(test_bit(BLK_MQ_S_STOPPED, &hctx->state))))
+		return BLK_MQ_RQ_QUEUE_BUSY;
+
 	if (ti->type->busy && ti->type->busy(ti))
 		return BLK_MQ_RQ_QUEUE_BUSY;