
dm-mq and end_clone_request()

Message ID 077d2708-3360-d8d7-fb3c-d3a73a1e03ee@sandisk.com (mailing list archive)
State Not Applicable, archived
Delegated to: Mike Snitzer

Commit Message

Bart Van Assche Aug. 8, 2016, 10:39 p.m. UTC
On 08/08/2016 08:26 AM, Laurence Oberman wrote:
> I will test this as well.
> I have lost my DDN array today (sadly:)) but I have two systems
> back to back again using ramdisk on the one to serve LUNs.
> 
> If I pull from https://github.com/bvanassche/linux again, and
> switch branch to srp-initiator-for-next, will I get all Mike's
> latest patches from last week + this? I guess I can just check
> myself, but might as well just ask.

Hello Laurence,

Sorry but I do not yet have a fix available for the scsi_forget_host()
crash you reported in an earlier e-mail. But Mike's latest patches
including the patch below are now available at
https://github.com/bvanassche/linux in the srp-initiator-for-next
branch. Further feedback is welcome.

Thanks,

Bart.

[PATCH] Check invariants at runtime

Warn if sdev->sdev_state != SDEV_DEL when __scsi_remove_device()
returns. Check whether all __scsi_remove_device() callers hold the
scan_mutex.
---
 drivers/scsi/scsi_sysfs.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
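
A note for testing: the lockdep_assert_held() check only fires on a kernel
built with lockdep enabled (e.g. CONFIG_PROVE_LOCKING=y, which selects
CONFIG_LOCKDEP); on other configurations it compiles away and only the
WARN_ONCE() state checks remain. A minimal sketch of the pattern being
added (illustrative only, not the patch itself):

/*
 * Entry invariant: the caller must hold the scan_mutex.
 * Exit invariant: the device must end up in the SDEV_DEL state.
 */
void __scsi_remove_device(struct scsi_device *sdev)
{
	lockdep_assert_held(&sdev->host->scan_mutex);

	/* ... existing removal logic ... */

	WARN_ONCE(sdev->sdev_state != SDEV_DEL,
		  "sdev state %d\n", sdev->sdev_state);
}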

Comments

Laurence Oberman Aug. 8, 2016, 10:52 p.m. UTC | #1
----- Original Message -----
> From: "Bart Van Assche" <bart.vanassche@sandisk.com>
> To: "Laurence Oberman" <loberman@redhat.com>
> Cc: dm-devel@redhat.com, "Mike Snitzer" <snitzer@redhat.com>, linux-scsi@vger.kernel.org, "Johannes Thumshirn"
> <jthumshirn@suse.de>
> Sent: Monday, August 8, 2016 6:39:07 PM
> Subject: Re: [dm-devel] dm-mq and end_clone_request()
> 
> On 08/08/2016 08:26 AM, Laurence Oberman wrote:
> > I will test this as well.
> > I have lost my DDN array today (sadly:)) but I have two systems
> > back to back again using ramdisk on the one to serve LUNs.
> > 
> > If I pull from https://github.com/bvanassche/linux again, and
> > switch branch to srp-initiator-for-next, will I get all Mike's
> > latest patches from last week + this? I guess I can just check
> > myself, but might as well just ask.
> 
> Hello Laurence,
> 
> Sorry but I do not yet have a fix available for the scsi_forget_host()
> crash you reported in an earlier e-mail. But Mike's latest patches
> including the patch below are now available at
> https://github.com/bvanassche/linux in the srp-initiator-for-next
> branch. Further feedback is welcome.
> 
> Thanks,
> 
> Bart.
> 
> [PATCH] Check invariants at runtime
> 
> Warn if sdev->sdev_state != SDEV_DEL when __scsi_remove_device()
> returns. Check whether all __scsi_remove_device() callers hold the
> scan_mutex.
> ---
>  drivers/scsi/scsi_sysfs.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index 82209ad4..a21e321 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -1312,6 +1312,8 @@ void __scsi_remove_device(struct scsi_device *sdev)
>  {
>  	struct device *dev = &sdev->sdev_gendev, *sdp = NULL;
>  
> +	lockdep_assert_held(&sdev->host->scan_mutex);
> +
>  	/*
>  	 * This cleanup path is not reentrant and while it is impossible
>  	 * to get a new reference with scsi_device_get() someone can still
> @@ -1321,8 +1323,11 @@ void __scsi_remove_device(struct scsi_device *sdev)
>  		return;
>  
>  	if (sdev->is_visible) {
> -		if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0)
> +		if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0) {
> +			WARN_ONCE(sdev->sdev_state != SDEV_DEL,
> +				  "sdev state %d\n", sdev->sdev_state);
>  			return;
> +		}
>  
>  		bsg_unregister_queue(sdev->request_queue);
>  		sdp = scsi_get_ulpdev(dev);
> @@ -1339,6 +1344,8 @@ void __scsi_remove_device(struct scsi_device *sdev)
>  	 * device.
>  	 */
>  	scsi_device_set_state(sdev, SDEV_DEL);
> +	WARN_ONCE(sdev->sdev_state != SDEV_DEL, "sdev state %d\n",
> +		  sdev->sdev_state);
>  	blk_cleanup_queue(sdev->request_queue);
>  	cancel_work_sync(&sdev->requeue_work);
>  
> --
> 2.9.2
> 
Hello Bart

No problem, Sir. I did apply the patch just to help you test, and so far it has been stable.
I will revert it and carry on with my debugging of the dm issue.
I do have the other patches in the original pull request I took, so I am running with all Mike's patches.

Many Thanks as always for all the help you provide all of us.

Thanks
Laurence

Laurence Oberman Aug. 9, 2016, 12:09 a.m. UTC | #2
----- Original Message -----
> From: "Laurence Oberman" <loberman@redhat.com>
> To: "Bart Van Assche" <bart.vanassche@sandisk.com>
> Cc: dm-devel@redhat.com, "Mike Snitzer" <snitzer@redhat.com>, linux-scsi@vger.kernel.org, "Johannes Thumshirn"
> <jthumshirn@suse.de>
> Sent: Monday, August 8, 2016 6:52:47 PM
> Subject: Re: [dm-devel] dm-mq and end_clone_request()
> 
> 
> 
> ----- Original Message -----
> > From: "Bart Van Assche" <bart.vanassche@sandisk.com>
> > To: "Laurence Oberman" <loberman@redhat.com>
> > Cc: dm-devel@redhat.com, "Mike Snitzer" <snitzer@redhat.com>,
> > linux-scsi@vger.kernel.org, "Johannes Thumshirn"
> > <jthumshirn@suse.de>
> > Sent: Monday, August 8, 2016 6:39:07 PM
> > Subject: Re: [dm-devel] dm-mq and end_clone_request()
> > 
> > On 08/08/2016 08:26 AM, Laurence Oberman wrote:
> > > I will test this as well.
> > > I have lost my DDN array today (sadly:)) but I have two systems
> > > back to back again using ramdisk on the one to serve LUNs.
> > > 
> > > If I pull from https://github.com/bvanassche/linux again, and
> > > switch branch to srp-initiator-for-next, will I get all Mike's
> > > latest patches from last week + this? I guess I can just check
> > > myself, but might as well just ask.
> > 
> > Hello Laurence,
> > 
> > Sorry but I do not yet have a fix available for the scsi_forget_host()
> > crash you reported in an earlier e-mail. But Mike's latest patches
> > including the patch below are now available at
> > https://github.com/bvanassche/linux in the srp-initiator-for-next
> > branch. Further feedback is welcome.
> > 
> > Thanks,
> > 
> > Bart.
> > 
> > [PATCH] Check invariants at runtime
> > 
> > Warn if sdev->sdev_state != SDEV_DEL when __scsi_remove_device()
> > returns. Check whether all __scsi_remove_device() callers hold the
> > scan_mutex.
> > ---
> >  drivers/scsi/scsi_sysfs.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> > index 82209ad4..a21e321 100644
> > --- a/drivers/scsi/scsi_sysfs.c
> > +++ b/drivers/scsi/scsi_sysfs.c
> > @@ -1312,6 +1312,8 @@ void __scsi_remove_device(struct scsi_device *sdev)
> >  {
> >  	struct device *dev = &sdev->sdev_gendev, *sdp = NULL;
> >  
> > +	lockdep_assert_held(&sdev->host->scan_mutex);
> > +
> >  	/*
> >  	 * This cleanup path is not reentrant and while it is impossible
> >  	 * to get a new reference with scsi_device_get() someone can still
> > @@ -1321,8 +1323,11 @@ void __scsi_remove_device(struct scsi_device *sdev)
> >  		return;
> >  
> >  	if (sdev->is_visible) {
> > -		if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0)
> > +		if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0) {
> > +			WARN_ONCE(sdev->sdev_state != SDEV_DEL,
> > +				  "sdev state %d\n", sdev->sdev_state);
> >  			return;
> > +		}
> >  
> >  		bsg_unregister_queue(sdev->request_queue);
> >  		sdp = scsi_get_ulpdev(dev);
> > @@ -1339,6 +1344,8 @@ void __scsi_remove_device(struct scsi_device *sdev)
> >  	 * device.
> >  	 */
> >  	scsi_device_set_state(sdev, SDEV_DEL);
> > +	WARN_ONCE(sdev->sdev_state != SDEV_DEL, "sdev state %d\n",
> > +		  sdev->sdev_state);
> >  	blk_cleanup_queue(sdev->request_queue);
> >  	cancel_work_sync(&sdev->requeue_work);
> >  
> > --
> > 2.9.2
> > 
> Hello Bart
> 
> No problem, Sir. I did apply the patch just to help you test, and so far
> it has been stable.
> I will revert it and carry on with my debugging of the dm issue.
> I do have the other patches in the original pull request I took, so I am
> running with all Mike's patches.
> 
> Many Thanks as always for all the help you provide all of us.
> 
> Thanks
> Laurence
> 
> 
Hello Bart

So now, back to a 10 LUN dual-path (ramdisk-backed) two-server configuration, I am unable to reproduce the dm issue.
Recovery is very fast with the servers connected back to back.
This is using your kernel and this multipath.conf

        device {
                vendor "LIO-ORG"
                product "*"
                path_grouping_policy "multibus"
                path_selector "round-robin 0"
                path_checker "tur"
                features "0"
                hardware_handler "0"
                no_path_retry "queue"
        }

Mike's patches have definitely stabilized this issue for me on this configuration.

I will see if I can move to a larger target server that has more memory and allocate more mpath devices.
I feel this issue in large configurations is now rooted in multipath not bringing back maps sometimes even when the actual paths are back via srp_daemon.
I am still tracking that down.

If you recall, last week I caused some of our own issues by forgetting I had a no_path_retry 12 hiding in my multipath.conf.
Since removing that and spending most of the weekend testing on the DDN array (had to give that back today), 
most of my issues were either the sporadic host delete race or multipath not re-instantiating paths.

I don't know if this helps, but since applying your latest patch I have not seen the host delete race.

Thanks
Laurence

Bart Van Assche Aug. 9, 2016, 3:51 p.m. UTC | #3
On 08/08/2016 05:09 PM, Laurence Oberman wrote:
> So now, back to a 10 LUN dual-path (ramdisk-backed) two-server
> configuration, I am unable to reproduce the dm issue.
> Recovery is very fast with the servers connected back to back.
> This is using your kernel and this multipath.conf
> 
> [ ... ]
> 
> Mike's patches have definitely stabilized this issue for me on this
> configuration.
> 
> I will see if I can move to a larger target server that has more
> memory and allocate more mpath devices. I feel this issue in large
> configurations is now rooted in multipath not bringing back maps
> sometimes even when the actual paths are back via srp_daemon.
> I am still tracking that down.
> 
> If you recall, last week I caused some of our own issues by
> forgetting I had a no_path_retry 12 hiding in my multipath.conf.
> Since removing that and spending most of the weekend testing on
> the DDN array (had to give that back today), most of my issues
> were either the sporadic host delete race or multipath not
> re-instantiating paths.
> 
> I don't know if this helps, but since applying your latest patch I
> have not seen the host delete race.

Hello Laurence,

My latest SCSI core patch adds additional instrumentation but does not
change the behavior of the SCSI core, so it cannot fix the
scsi_forget_host() crash you reported.

On my setup, with the kernel code from the srp-initiator-for-next
branch and with CONFIG_DM_MQ_DEFAULT=n, I still see fio report I/O
errors every now and then when I run the srp-test software. What I see
in syslog seems to indicate that these I/O errors are generated by
dm-mpath:

Aug  9 08:45:39 ion-dev-ib-ini kernel: mpath 254:1: queue_if_no_path 1 -> 0
Aug  9 08:45:39 ion-dev-ib-ini kernel: must_push_back: 107 callbacks suppressed
Aug  9 08:45:39 ion-dev-ib-ini kernel: device-mapper: multipath: must_push_back: queue_if_no_path=0 suspend_active=1 suspending=0
Aug  9 08:45:39 ion-dev-ib-ini kernel: __multipath_map(): (a) returning -5
Aug  9 08:45:39 ion-dev-ib-ini kernel: map_request(): clone_and_map_rq() returned -5
Aug  9 08:45:39 ion-dev-ib-ini kernel: dm_complete_request: error = -5
Aug  9 08:45:39 ion-dev-ib-ini kernel: dm_softirq_done: dm-1 tio->error = -5
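
The sequence above is easier to read with the mapping decision in mind:
once queue_if_no_path has been switched off and no usable path is left,
the multipath map callback fails the clone with -EIO (-5) instead of
pushing it back for a requeue, and dm core then completes the original
request with that error. A small self-contained model of that decision
follows; it is only a sketch with simplified, made-up names, not the
dm-mpath source:

/* Toy model of the no-path decision implied by the log above.
 * Builds and runs in user space; all names are simplified stand-ins. */
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

struct mpath_state {
	bool has_usable_path;		/* any active path left in the map? */
	bool queue_if_no_path;		/* queue_if_no_path / no_path_retry queue */
	bool noflush_suspending;	/* no-flush suspend in progress */
};

/* Roughly: requeue ("push back") instead of failing only while queueing
 * is enabled or a no-flush suspend is in progress. */
static bool must_push_back(const struct mpath_state *m)
{
	return m->queue_if_no_path || m->noflush_suspending;
}

static int multipath_map(const struct mpath_state *m)
{
	if (!m->has_usable_path) {
		if (must_push_back(m))
			return -EAGAIN;	/* stand-in for DM_MAPIO_REQUEUE */
		return -EIO;		/* the -5 seen in the log */
	}
	return 0;			/* would dispatch to a path */
}

int main(void)
{
	/* State matching the log: no path, queue_if_no_path just cleared. */
	struct mpath_state m = {
		.has_usable_path = false,
		.queue_if_no_path = false,
		.noflush_suspending = false,
	};

	printf("map result: %d\n", multipath_map(&m));	/* prints -5 */
	return 0;
}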

Bart.


Patch

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 82209ad4..a21e321 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1312,6 +1312,8 @@  void __scsi_remove_device(struct scsi_device *sdev)
 {
 	struct device *dev = &sdev->sdev_gendev, *sdp = NULL;
 
+	lockdep_assert_held(&sdev->host->scan_mutex);
+
 	/*
 	 * This cleanup path is not reentrant and while it is impossible
 	 * to get a new reference with scsi_device_get() someone can still
@@ -1321,8 +1323,11 @@  void __scsi_remove_device(struct scsi_device *sdev)
 		return;
 
 	if (sdev->is_visible) {
-		if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0)
+		if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0) {
+			WARN_ONCE(sdev->sdev_state != SDEV_DEL,
+				  "sdev state %d\n", sdev->sdev_state);
 			return;
+		}
 
 		bsg_unregister_queue(sdev->request_queue);
 		sdp = scsi_get_ulpdev(dev);
@@ -1339,6 +1344,8 @@  void __scsi_remove_device(struct scsi_device *sdev)
 	 * device.
 	 */
 	scsi_device_set_state(sdev, SDEV_DEL);
+	WARN_ONCE(sdev->sdev_state != SDEV_DEL, "sdev state %d\n",
+		  sdev->sdev_state);
 	blk_cleanup_queue(sdev->request_queue);
 	cancel_work_sync(&sdev->requeue_work);