From patchwork Fri Dec 8 09:42:09 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jason Yan X-Patchwork-Id: 10102171 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 1264360325 for ; Fri, 8 Dec 2017 10:05:44 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E6FF028AE1 for ; Fri, 8 Dec 2017 10:05:43 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id DB60A28AF7; Fri, 8 Dec 2017 10:05:43 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0FE3728AE1 for ; Fri, 8 Dec 2017 10:05:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753478AbdLHKFM (ORCPT ); Fri, 8 Dec 2017 05:05:12 -0500 Received: from szxga05-in.huawei.com ([45.249.212.191]:2270 "EHLO huawei.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1753371AbdLHJhW (ORCPT ); Fri, 8 Dec 2017 04:37:22 -0500 Received: from DGGEMS414-HUB.china.huawei.com (unknown [172.30.72.60]) by Forcepoint Email with ESMTP id BB2723CA990E; Fri, 8 Dec 2017 17:36:29 +0800 (CST) Received: from huawei.com (10.175.124.28) by DGGEMS414-HUB.china.huawei.com (10.3.19.214) with Microsoft SMTP Server id 14.3.361.1; Fri, 8 Dec 2017 17:36:23 +0800 From: Jason Yan To: , CC: , , , , , , , , , , , , Jason Yan , Ewan Milne , Tomas Henzl Subject: [PATCH v5 6/7] scsi: libsas: direct call probe and destruct Date: Fri, 8 Dec 2017 17:42:09 +0800 Message-ID: <20171208094210.24887-7-yanaijie@huawei.com> X-Mailer: git-send-email 2.9.5 In-Reply-To: <20171208094210.24887-1-yanaijie@huawei.com> References: <20171208094210.24887-1-yanaijie@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.124.28] X-CFilter-Loop: Reflected Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP In commit 87c8331fcf72 ("[SCSI] libsas: prevent domain rediscovery competing with ata error handling") introduced disco mutex to prevent rediscovery competing with ata error handling and put the whole revalidation in the mutex. But the rphy add/remove needs to wait for the error handling which also grabs the disco mutex. This may leads to dead lock.So the probe and destruct event were introduce to do the rphy add/remove asynchronously and out of the lock. The asynchronously processed workers makes the whole discovery process not atomic, the other events may interrupt the process. For example, if a loss of signal event inserted before the probe event, the sas_deform_port() is called and the port will be deleted. And sas_port_delete() may run before the destruct event, but the port-x:x is the top parent of end device or expander. This leads to a kernel WARNING such as: [ 82.042979] sysfs group 'power' not found for kobject 'phy-1:0:22' [ 82.042983] ------------[ cut here ]------------ [ 82.042986] WARNING: CPU: 54 PID: 1714 at fs/sysfs/group.c:237 sysfs_remove_group+0x94/0xa0 [ 82.043059] Call trace: [ 82.043082] [] sysfs_remove_group+0x94/0xa0 [ 82.043085] [] dpm_sysfs_remove+0x60/0x70 [ 82.043086] [] device_del+0x138/0x308 [ 82.043089] [] sas_phy_delete+0x38/0x60 [ 82.043091] [] do_sas_phy_delete+0x6c/0x80 [ 82.043093] [] device_for_each_child+0x58/0xa0 [ 82.043095] [] sas_remove_children+0x40/0x50 [ 82.043100] [] sas_destruct_devices+0x64/0xa0 [ 82.043102] [] process_one_work+0x1fc/0x4b0 [ 82.043104] [] worker_thread+0x50/0x490 [ 82.043105] [] kthread+0xfc/0x128 [ 82.043107] [] ret_from_fork+0x10/0x50 Make probe and destruct a direct call in the disco and revalidate function, but put them outside the lock. The whole discovery or revalidate won't be interrupted by other events. And the DISCE_PROBE and DISCE_DESTRUCT event are deleted as a result of the direct call. Introduce a new list to destruct the sas_port and put the port delete after the destruct. This makes sure the right order of destroying the sysfs kobject and fix the warning above. In sas_ex_revalidate_domain() have a loop to find all broadcasted device, and sometimes we have a chance to find the same expander twice. Because the sas_port will be deleted at the end of the whole revalidate process, sas_port with the same name cannot be added before this. Otherwise the sysfs will complain of creating duplicate filename. Since the LLDD will send broadcast for every device change, we can only process one expander's revalidation. Signed-off-by: Jason Yan CC: John Garry CC: Johannes Thumshirn CC: Ewan Milne CC: Christoph Hellwig CC: Tomas Henzl CC: Dan Williams Signed-off-by: Hannes Reinecke --- drivers/scsi/libsas/sas_ata.c | 1 - drivers/scsi/libsas/sas_discover.c | 32 ++++++++++++++++++-------------- drivers/scsi/libsas/sas_expander.c | 8 +++----- drivers/scsi/libsas/sas_internal.h | 1 + drivers/scsi/libsas/sas_port.c | 3 +++ include/scsi/libsas.h | 3 +-- include/scsi/scsi_transport_sas.h | 1 + 7 files changed, 27 insertions(+), 22 deletions(-) diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c index 70be442..2b3637b 100644 --- a/drivers/scsi/libsas/sas_ata.c +++ b/drivers/scsi/libsas/sas_ata.c @@ -730,7 +730,6 @@ int sas_discover_sata(struct domain_device *dev) if (res) return res; - sas_discover_event(dev->port, DISCE_PROBE); return 0; } diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c index 14f714d..190108e 100644 --- a/drivers/scsi/libsas/sas_discover.c +++ b/drivers/scsi/libsas/sas_discover.c @@ -212,13 +212,9 @@ void sas_notify_lldd_dev_gone(struct domain_device *dev) } } -static void sas_probe_devices(struct work_struct *work) +static void sas_probe_devices(struct asd_sas_port *port) { struct domain_device *dev, *n; - struct sas_discovery_event *ev = to_sas_discovery_event(work); - struct asd_sas_port *port = ev->port; - - clear_bit(DISCE_PROBE, &port->disc.pending); /* devices must be domain members before link recovery and probe */ list_for_each_entry(dev, &port->disco_list, disco_list_node) { @@ -294,7 +290,6 @@ int sas_discover_end_dev(struct domain_device *dev) res = sas_notify_lldd_dev_found(dev); if (res) return res; - sas_discover_event(dev->port, DISCE_PROBE); return 0; } @@ -353,13 +348,9 @@ static void sas_unregister_common_dev(struct asd_sas_port *port, struct domain_d sas_put_device(dev); } -static void sas_destruct_devices(struct work_struct *work) +void sas_destruct_devices(struct asd_sas_port *port) { struct domain_device *dev, *n; - struct sas_discovery_event *ev = to_sas_discovery_event(work); - struct asd_sas_port *port = ev->port; - - clear_bit(DISCE_DESTRUCT, &port->disc.pending); list_for_each_entry_safe(dev, n, &port->destroy_list, disco_list_node) { list_del_init(&dev->disco_list_node); @@ -370,6 +361,16 @@ static void sas_destruct_devices(struct work_struct *work) } } +void sas_destruct_ports(struct asd_sas_port *port) +{ + struct sas_port *sas_port, *p; + + list_for_each_entry_safe(sas_port, p, &port->sas_port_del_list, del_list) { + list_del_init(&sas_port->del_list); + sas_port_delete(sas_port); + } +} + void sas_unregister_dev(struct asd_sas_port *port, struct domain_device *dev) { if (!test_bit(SAS_DEV_DESTROY, &dev->state) && @@ -384,7 +385,6 @@ void sas_unregister_dev(struct asd_sas_port *port, struct domain_device *dev) if (!test_and_set_bit(SAS_DEV_DESTROY, &dev->state)) { sas_rphy_unlink(dev->rphy); list_move_tail(&dev->disco_list_node, &port->destroy_list); - sas_discover_event(dev->port, DISCE_DESTRUCT); } } @@ -490,6 +490,8 @@ static void sas_discover_domain(struct work_struct *work) port->port_dev = NULL; } + sas_probe_devices(port); + SAS_DPRINTK("DONE DISCOVERY on port %d, pid:%d, result:%d\n", port->id, task_pid_nr(current), error); } @@ -523,6 +525,10 @@ static void sas_revalidate_domain(struct work_struct *work) port->id, task_pid_nr(current), res); out: mutex_unlock(&ha->disco_mutex); + + sas_destruct_devices(port); + sas_destruct_ports(port); + sas_probe_devices(port); } /* ---------- Events ---------- */ @@ -578,10 +584,8 @@ void sas_init_disc(struct sas_discovery *disc, struct asd_sas_port *port) static const work_func_t sas_event_fns[DISC_NUM_EVENTS] = { [DISCE_DISCOVER_DOMAIN] = sas_discover_domain, [DISCE_REVALIDATE_DOMAIN] = sas_revalidate_domain, - [DISCE_PROBE] = sas_probe_devices, [DISCE_SUSPEND] = sas_suspend_devices, [DISCE_RESUME] = sas_resume_devices, - [DISCE_DESTRUCT] = sas_destruct_devices, }; disc->pending = 0; diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c index ca15662..50cb0f3 100644 --- a/drivers/scsi/libsas/sas_expander.c +++ b/drivers/scsi/libsas/sas_expander.c @@ -1914,7 +1914,8 @@ static void sas_unregister_devs_sas_addr(struct domain_device *parent, sas_port_delete_phy(phy->port, phy->phy); sas_device_set_phy(found, phy->port); if (phy->port->num_phys == 0) - sas_port_delete(phy->port); + list_add_tail(&phy->port->del_list, + &parent->port->sas_port_del_list); phy->port = NULL; } } @@ -2122,7 +2123,7 @@ int sas_ex_revalidate_domain(struct domain_device *port_dev) struct domain_device *dev = NULL; res = sas_find_bcast_dev(port_dev, &dev); - while (res == 0 && dev) { + if (res == 0 && dev) { struct expander_device *ex = &dev->ex_dev; int i = 0, phy_id; @@ -2134,9 +2135,6 @@ int sas_ex_revalidate_domain(struct domain_device *port_dev) res = sas_rediscover(dev, phy_id); i = phy_id + 1; } while (i < ex->num_phys); - - dev = NULL; - res = sas_find_bcast_dev(port_dev, &dev); } return res; } diff --git a/drivers/scsi/libsas/sas_internal.h b/drivers/scsi/libsas/sas_internal.h index d8826a7..50e12d6 100644 --- a/drivers/scsi/libsas/sas_internal.h +++ b/drivers/scsi/libsas/sas_internal.h @@ -101,6 +101,7 @@ int sas_try_ata_reset(struct asd_sas_phy *phy); void sas_hae_reset(struct work_struct *work); void sas_free_device(struct kref *kref); +void sas_destruct_devices(struct asd_sas_port *port); extern const work_func_t sas_phy_event_fns[PHY_NUM_EVENTS]; extern const work_func_t sas_port_event_fns[PORT_NUM_EVENTS]; diff --git a/drivers/scsi/libsas/sas_port.c b/drivers/scsi/libsas/sas_port.c index 64722f4..f07e55d 100644 --- a/drivers/scsi/libsas/sas_port.c +++ b/drivers/scsi/libsas/sas_port.c @@ -66,6 +66,7 @@ static void sas_resume_port(struct asd_sas_phy *phy) rc = sas_notify_lldd_dev_found(dev); if (rc) { sas_unregister_dev(port, dev); + sas_destruct_devices(port); continue; } @@ -220,6 +221,7 @@ void sas_deform_port(struct asd_sas_phy *phy, int gone) if (port->num_phys == 1) { sas_unregister_domain_devices(port, gone); + sas_destruct_devices(port); sas_port_delete(port->port); port->port = NULL; } else { @@ -317,6 +319,7 @@ static void sas_init_port(struct asd_sas_port *port, INIT_LIST_HEAD(&port->dev_list); INIT_LIST_HEAD(&port->disco_list); INIT_LIST_HEAD(&port->destroy_list); + INIT_LIST_HEAD(&port->sas_port_del_list); spin_lock_init(&port->phy_list_lock); INIT_LIST_HEAD(&port->phy_list); port->ha = sas_ha; diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h index fa93c41..225ab77 100644 --- a/include/scsi/libsas.h +++ b/include/scsi/libsas.h @@ -82,10 +82,8 @@ enum phy_event { enum discover_event { DISCE_DISCOVER_DOMAIN = 0U, DISCE_REVALIDATE_DOMAIN, - DISCE_PROBE, DISCE_SUSPEND, DISCE_RESUME, - DISCE_DESTRUCT, DISC_NUM_EVENTS, }; @@ -262,6 +260,7 @@ struct asd_sas_port { struct list_head dev_list; struct list_head disco_list; struct list_head destroy_list; + struct list_head sas_port_del_list; enum sas_linkrate linkrate; struct sas_work work; diff --git a/include/scsi/scsi_transport_sas.h b/include/scsi/scsi_transport_sas.h index 62895b4..05ec927 100644 --- a/include/scsi/scsi_transport_sas.h +++ b/include/scsi/scsi_transport_sas.h @@ -156,6 +156,7 @@ struct sas_port { struct mutex phy_list_mutex; struct list_head phy_list; + struct list_head del_list; /* libsas only */ }; #define dev_to_sas_port(d) \