From patchwork Mon Mar 28 18:21:04 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Laurence Oberman X-Patchwork-Id: 8680221 Return-Path: X-Original-To: patchwork-linux-scsi@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 578C29F30C for ; Mon, 28 Mar 2016 18:21:45 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 4247320274 for ; Mon, 28 Mar 2016 18:21:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D2BEF20272 for ; Mon, 28 Mar 2016 18:21:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751544AbcC1SVm (ORCPT ); Mon, 28 Mar 2016 14:21:42 -0400 Received: from mx3-phx2.redhat.com ([209.132.183.24]:40043 "EHLO mx3-phx2.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751212AbcC1SVl (ORCPT ); Mon, 28 Mar 2016 14:21:41 -0400 Received: from zmail22.collab.prod.int.phx2.redhat.com (zmail22.collab.prod.int.phx2.redhat.com [10.5.83.26]) by mx3-phx2.redhat.com (8.13.8/8.13.8) with ESMTP id u2SIL560007554; Mon, 28 Mar 2016 14:21:05 -0400 Date: Mon, 28 Mar 2016 14:21:04 -0400 (EDT) From: Laurence Oberman To: Bart Van Assche Cc: James Bottomley , "Martin K. Petersen" , Hannes Reinecke , Christoph Hellwig , linux-scsi@vger.kernel.org Message-ID: <933006845.25395115.1459189264684.JavaMail.zimbra@redhat.com> In-Reply-To: <56F9746C.40101@sandisk.com> References: <56F9740A.4060100@sandisk.com> <56F9746C.40101@sandisk.com> Subject: Re: [PATCH 2/2] scsi_dh_alua: Fix a recently introduced deadlock MIME-Version: 1.0 X-Originating-IP: [10.18.49.29] X-Mailer: Zimbra 8.0.6_GA_5922 (ZimbraWebClient - FF38 (Linux)/8.0.6_GA_5922) Thread-Topic: scsi_dh_alua: Fix a recently introduced deadlock Thread-Index: IZApPCOuF7zshYod8cU+12wbXOXFpg== Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org X-Spam-Status: No, score=-7.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP In similar testing last month with mlx4_ib I was able to generate the same lockup so this patch will help. If I get a chance will add a Tested-by this week. Reviewed-by: Laurence Oberman Laurence Oberman Principal Software Maintenance Engineer Red Hat Global Support Services ----- Original Message ----- From: "Bart Van Assche" To: "James Bottomley" , "Martin K. Petersen" Cc: "Hannes Reinecke" , "Christoph Hellwig" , linux-scsi@vger.kernel.org Sent: Monday, March 28, 2016 2:14:04 PM Subject: [PATCH 2/2] scsi_dh_alua: Fix a recently introduced deadlock While retesting the SRP initiator I ran the command "rmmod mlx4_ib" while I/O was in progress. That command triggers SCSI device removal indirectly. Avoid that this action triggers the following deadlock: ================================= [ INFO: inconsistent lock state ] 4.6.0-rc0-dbg+ #2 Tainted: G O --------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. multipathd/484 [HC0[0]:SC0[0]:HE1:SE1] takes: (&(&pg->lock)->rlock){+.?...}, at: [] alua_bus_detach+0x52/0xa0 [scsi_dh_alua] {IN-SOFTIRQ-W} state was registered at: [] __lock_acquire+0x7e9/0x1ad0 [] lock_acquire+0x60/0x80 [] _raw_spin_lock_irqsave+0x3e/0x60 [] alua_rtpg_queue+0x41/0x1d0 [scsi_dh_alua] [] alua_check+0xe1/0x220 [scsi_dh_alua] [] alua_check_sense+0x99/0xb0 [scsi_dh_alua] [] scsi_check_sense+0x71/0x3f0 [] scsi_decide_disposition+0x18b/0x1d0 [] scsi_softirq_done+0x52/0x140 [] blk_done_softirq+0x52/0x90 [] __do_softirq+0x10f/0x230 [] irq_exit+0xa8/0xb0 [] do_IRQ+0x65/0x110 [] ret_from_intr+0x0/0x19 [] kmem_cache_alloc+0x151/0x190 [] create_object+0x34/0x2d0 [] kmemleak_alloc_percpu+0x56/0xd0 [] pcpu_alloc+0x38d/0x660 [] __alloc_percpu_gfp+0xd/0x10 [] __percpu_counter_init+0x55/0xb0 [] blkg_alloc+0x79/0x230 [] blkcg_init_queue+0x26/0x1d0 [] blk_alloc_queue_node+0x27d/0x2e0 [] dm_create+0x20c/0x570 [dm_mod] [] dev_create+0x56/0x2c0 [dm_mod] [] ctl_ioctl+0x26e/0x520 [dm_mod] [] dm_ctl_ioctl+0xe/0x20 [dm_mod] [] do_vfs_ioctl+0x8e/0x660 [] SyS_ioctl+0x3c/0x70 [] entry_SYSCALL_64_fastpath+0x1c/0xac irq event stamp: 4290931 hardirqs last enabled at (4290931): [ 1662.892772] [] _raw_spin_unlock_irqrestore+0x31/0x50 hardirqs last disabled at (4290930): [] _raw_spin_lock_irqsave+0x17/0x60 softirqs last enabled at (4290774): [] __do_softirq+0x1cb/0x230 softirqs last disabled at (4289831): [] irq_exit+0xa8/0xb0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&pg->lock)->rlock); lock(&(&pg->lock)->rlock); *** DEADLOCK *** 2 locks held by multipathd/484: #0: (&bdev->bd_mutex){+.+.+.}, at: [] __blkdev_put+0x33/0x360 #1: (sd_ref_mutex){+.+...}, at: [] scsi_disk_put+0x1c/0x40 stack backtrace: CPU: 6 PID: 484 Comm: multipathd Tainted: G O 4.6.0-rc0-dbg+ #2 Call Trace: [] dump_stack+0x67/0x92 [] print_usage_bug+0x215/0x240 [] mark_lock+0x54a/0x610 [] __lock_acquire+0x845/0x1ad0 [] lock_acquire+0x60/0x80 [] _raw_spin_lock+0x33/0x50 [] alua_bus_detach+0x52/0xa0 [scsi_dh_alua] [] scsi_dh_release_device+0x17/0x50 [] scsi_device_dev_release_usercontext+0x2a/0x120 [] execute_in_process_context+0x80/0x90 [] scsi_device_dev_release+0x17/0x20 [] device_release+0x2d/0x90 [] kobject_release+0x7a/0x190 [] kobject_put+0x26/0x50 [] put_device+0x12/0x20 [] scsi_device_put+0x26/0x30 [] scsi_disk_put+0x2d/0x40 [] sd_release+0x48/0xb0 [] __blkdev_put+0x29e/0x360 [] blkdev_put+0x49/0x170 [] blkdev_close+0x20/0x30 [] __fput+0xe8/0x1f0 [] ____fput+0x9/0x10 [] task_work_run+0x6e/0xa0 [] exit_to_usermode_loop+0xa9/0xb0 [] syscall_return_slowpath+0xb0/0xc0 [] entry_SYSCALL_64_fastpath+0xaa/0xac Fixes: cb0a168cb6b8 (scsi_dh_alua: update 'access_state' field) Signed-off-by: Bart Van Assche Cc: Hannes Reinecke Signed-off-by: Bart Van Assche --- drivers/scsi/device_handler/scsi_dh_alua.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c index a404a41..8eaed05 100644 --- a/drivers/scsi/device_handler/scsi_dh_alua.c +++ b/drivers/scsi/device_handler/scsi_dh_alua.c @@ -1112,9 +1112,9 @@ static void alua_bus_detach(struct scsi_device *sdev) h->sdev = NULL; spin_unlock(&h->pg_lock); if (pg) { - spin_lock(&pg->lock); + spin_lock_irq(&pg->lock); list_del_rcu(&h->node); - spin_unlock(&pg->lock); + spin_unlock_irq(&pg->lock); kref_put(&pg->kref, release_port_group); } sdev->handler_data = NULL;