From patchwork Tue Sep 21 14:52:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Julian Wiedmann X-Patchwork-Id: 12507995 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9B035C433FE for ; Tue, 21 Sep 2021 14:52:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 76E8160F36 for ; Tue, 21 Sep 2021 14:52:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233793AbhIUOx6 (ORCPT ); Tue, 21 Sep 2021 10:53:58 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:35044 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S233784AbhIUOx5 (ORCPT ); Tue, 21 Sep 2021 10:53:57 -0400 Received: from pps.filterd (m0098416.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 18LE9cJd011320; Tue, 21 Sep 2021 10:52:26 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=km1rTew9P7HfOTnAh0PTyHk+Mzy0LmoYrZdRyM0Mpn4=; b=mRzgpk2xL7V+NEdeJpBz/m7oZKE/VeA4T4teDnHMQ0xUH0eKh3Og9HzS5qWNKG+gCO7p H9+Bpnw0DS6Jj3xQWGPqEGYvPtTU2Kax0VJMHZ7TpZfzlPm2BQrmqTkDrBWJB8yEZ9FB CKk/4bDPzYv17uukBO9yuDFC95SsVNxcoMYPts4368nAPbydEjKdX9tmK2QBH/zsHwdK 8DupZWkWKBZ0Nw2w8gV+dfZyH7vOk440mSZhxnH4plqf2c2jkcZaokkEehM2Rbo8JLzh RnAjs8zG3Kz+bt2lbngNRHmu9fRXTBkbscIZh2ygvI3ZTlEGlf5vaMA5yAqATbPUPYC1 eQ== Received: from ppma06fra.de.ibm.com (48.49.7a9f.ip4.static.sl-reverse.com [159.122.73.72]) by mx0b-001b2d01.pphosted.com with ESMTP id 3b7e4rxkvg-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 21 Sep 2021 10:52:25 -0400 Received: from pps.filterd (ppma06fra.de.ibm.com [127.0.0.1]) by ppma06fra.de.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 18LEnEjD003777; Tue, 21 Sep 2021 14:52:23 GMT Received: from b06avi18626390.portsmouth.uk.ibm.com (b06avi18626390.portsmouth.uk.ibm.com [9.149.26.192]) by ppma06fra.de.ibm.com with ESMTP id 3b57cjnex4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 21 Sep 2021 14:52:23 +0000 Received: from d06av21.portsmouth.uk.ibm.com (d06av21.portsmouth.uk.ibm.com [9.149.105.232]) by b06avi18626390.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 18LElYIE59441650 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 21 Sep 2021 14:47:34 GMT Received: from d06av21.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 29BA952067; Tue, 21 Sep 2021 14:52:19 +0000 (GMT) Received: from tuxmaker.boeblingen.de.ibm.com (unknown [9.152.85.9]) by d06av21.portsmouth.uk.ibm.com (Postfix) with ESMTP id C4C385204E; Tue, 21 Sep 2021 14:52:18 +0000 (GMT) From: Julian Wiedmann To: David Miller , Jakub Kicinski Cc: linux-netdev , linux-s390 , Heiko Carstens , Karsten Graul , Julian Wiedmann , Alexandra Winter , Stefan Raspl Subject: [PATCH net 1/3] s390/qeth: fix NULL deref in qeth_clear_working_pool_list() Date: Tue, 21 Sep 2021 16:52:15 +0200 Message-Id: <20210921145217.1584654-2-jwi@linux.ibm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210921145217.1584654-1-jwi@linux.ibm.com> References: <20210921145217.1584654-1-jwi@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: J7uUZVdeAlj9wIWBGhZdytcOnE61Pwhh X-Proofpoint-ORIG-GUID: J7uUZVdeAlj9wIWBGhZdytcOnE61Pwhh X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.391,FMLib:17.0.607.475 definitions=2021-09-21_04,2021-09-20_01,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxscore=0 bulkscore=0 clxscore=1015 mlxlogscore=999 impostorscore=0 phishscore=0 lowpriorityscore=0 priorityscore=1501 malwarescore=0 suspectscore=0 spamscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109030001 definitions=main-2109210089 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org When qeth_set_online() calls qeth_clear_working_pool_list() to roll back after an error exit from qeth_hardsetup_card(), we are at risk of accessing card->qdio.in_q before it was allocated by qeth_alloc_qdio_queues() via qeth_mpc_initialize(). qeth_clear_working_pool_list() then dereferences NULL, and by writing to queue->bufs[i].pool_entry scribbles all over the CPU's lowcore. Resulting in a crash when those lowcore areas are used next (eg. on the next machine-check interrupt). Such a scenario would typically happen when the device is first set online and its queues aren't allocated yet. An early IO error or certain misconfigs (eg. mismatched transport mode, bad portno) then cause us to error out from qeth_hardsetup_card() with card->qdio.in_q still being NULL. Fix it by checking the pointer for NULL before accessing it. Note that we also have (rare) paths inside qeth_mpc_initialize() where a configuration change can cause us to free the existing queues, expecting that subsequent code will allocate them again. If we then error out before that re-allocation happens, the same bug occurs. Fixes: eff73e16ee11 ("s390/qeth: tolerate pre-filled RX buffer") Reported-by: Stefan Raspl Root-caused-by: Heiko Carstens Signed-off-by: Julian Wiedmann Reviewed-by: Alexandra Winter --- drivers/s390/net/qeth_core_main.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c index 41ca6273b750..3fba440a0731 100644 --- a/drivers/s390/net/qeth_core_main.c +++ b/drivers/s390/net/qeth_core_main.c @@ -202,6 +202,9 @@ static void qeth_clear_working_pool_list(struct qeth_card *card) &card->qdio.in_buf_pool.entry_list, list) list_del(&pool_entry->list); + if (!queue) + return; + for (i = 0; i < ARRAY_SIZE(queue->bufs); i++) queue->bufs[i].pool_entry = NULL; } From patchwork Tue Sep 21 14:52:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Julian Wiedmann X-Patchwork-Id: 12507999 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C791BC43217 for ; Tue, 21 Sep 2021 14:52:34 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B04CE6109E for ; Tue, 21 Sep 2021 14:52:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233812AbhIUOyB (ORCPT ); Tue, 21 Sep 2021 10:54:01 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:20502 "EHLO mx0b-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233798AbhIUOx7 (ORCPT ); Tue, 21 Sep 2021 10:53:59 -0400 Received: from pps.filterd (m0127361.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 18LEKFSX037730; Tue, 21 Sep 2021 10:52:27 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=9dL6tQd4tAav7VDRjDMxUg/fUV3+1e2wrD6EQNW8ok8=; b=rgfkNtclXHzPTDnOmjkQYswBCOfRUyOQ2azCnWWbdOJ8HKLPynYPIXjTmQlfWQtjWHOO f5PKrP3HJu3fXhd7fIKiNBU6mN0OdOnJ6qyoeB4vzVQCWAXlzsj/lGq46+5w6WPpm8ST zLzXvqmRNkW34txcnpXXEUcgrfpEnB6clZnhLAss4rCIMwjDXFJ6aeZgSbtscDKQfo2t 9gRgH91I4Jpci9R6s6sGXWzeVFWKWtHH5GSTeTqiuZJTJQZk5OEmC+JmWv0COC0lUDkT ug7hwouhBEa6XCc0Ae6BJ0XjdqkGkHGoDKIJNzCI7t/VvKIm2ZQPofUTER+doom5/2SN gw== Received: from ppma04fra.de.ibm.com (6a.4a.5195.ip4.static.sl-reverse.com [149.81.74.106]) by mx0a-001b2d01.pphosted.com with ESMTP id 3b7ek3nd36-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 21 Sep 2021 10:52:27 -0400 Received: from pps.filterd (ppma04fra.de.ibm.com [127.0.0.1]) by ppma04fra.de.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 18LEqFTd020123; Tue, 21 Sep 2021 14:52:25 GMT Received: from b06avi18626390.portsmouth.uk.ibm.com (b06avi18626390.portsmouth.uk.ibm.com [9.149.26.192]) by ppma04fra.de.ibm.com with ESMTP id 3b57r9nc7x-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 21 Sep 2021 14:52:25 +0000 Received: from d06av21.portsmouth.uk.ibm.com (d06av21.portsmouth.uk.ibm.com [9.149.105.232]) by b06avi18626390.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 18LElYON61473246 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 21 Sep 2021 14:47:34 GMT Received: from d06av21.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 88F645205A; Tue, 21 Sep 2021 14:52:19 +0000 (GMT) Received: from tuxmaker.boeblingen.de.ibm.com (unknown [9.152.85.9]) by d06av21.portsmouth.uk.ibm.com (Postfix) with ESMTP id 388E552071; Tue, 21 Sep 2021 14:52:19 +0000 (GMT) From: Julian Wiedmann To: David Miller , Jakub Kicinski Cc: linux-netdev , linux-s390 , Heiko Carstens , Karsten Graul , Julian Wiedmann , Alexandra Winter Subject: [PATCH net 2/3] s390/qeth: Fix deadlock in remove_discipline Date: Tue, 21 Sep 2021 16:52:16 +0200 Message-Id: <20210921145217.1584654-3-jwi@linux.ibm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210921145217.1584654-1-jwi@linux.ibm.com> References: <20210921145217.1584654-1-jwi@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: 3C_e2yalCjallge8bfe9v5EjJmOh3vjW X-Proofpoint-ORIG-GUID: 3C_e2yalCjallge8bfe9v5EjJmOh3vjW X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.391,FMLib:17.0.607.475 definitions=2021-09-21_04,2021-09-20_01,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 suspectscore=0 impostorscore=0 lowpriorityscore=0 spamscore=0 mlxscore=0 adultscore=0 mlxlogscore=999 bulkscore=0 clxscore=1015 malwarescore=0 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109030001 definitions=main-2109210089 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: Alexandra Winter Problem: qeth_close_dev_handler is a worker that tries to acquire card->discipline_mutex via drv->set_offline() in ccwgroup_set_offline(). Since commit b41b554c1ee7 ("s390/qeth: fix locking for discipline setup / removal") qeth_remove_discipline() is called under card->discipline_mutex and cancels the work and waits for it to finish. STOPLAN reception with reason code IPA_RC_VEPA_TO_VEB_TRANSITION is the only situation that schedules close_dev_work. In that situation scheduling qeth recovery will also result in an offline interface, when resetting the isolation mode fails, if the external switch is still set to VEB. And since commit 0b9902c1fcc5 ("s390/qeth: fix deadlock during recovery") qeth recovery does not aquire card->discipline_mutex anymore. So we accept the longer pathlength of qeth_schedule_recovery in this error situation and re-use the existing function. As a side-benefit this changes the hwtrap to behave like during recovery instead of like during a user-triggered set_offline. Fixes: b41b554c1ee7 ("s390/qeth: fix locking for discipline setup / removal") Signed-off-by: Alexandra Winter Acked-by: Julian Wiedmann Signed-off-by: Julian Wiedmann --- drivers/s390/net/qeth_core.h | 1 - drivers/s390/net/qeth_core_main.c | 16 ++++------------ drivers/s390/net/qeth_l2_main.c | 1 - drivers/s390/net/qeth_l3_main.c | 1 - 4 files changed, 4 insertions(+), 15 deletions(-) diff --git a/drivers/s390/net/qeth_core.h b/drivers/s390/net/qeth_core.h index 535a60b3946d..a5aa0bdc61d6 100644 --- a/drivers/s390/net/qeth_core.h +++ b/drivers/s390/net/qeth_core.h @@ -858,7 +858,6 @@ struct qeth_card { struct napi_struct napi; struct qeth_rx rx; struct delayed_work buffer_reclaim_work; - struct work_struct close_dev_work; }; static inline bool qeth_card_hw_is_reachable(struct qeth_card *card) diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c index 3fba440a0731..9f26706051e5 100644 --- a/drivers/s390/net/qeth_core_main.c +++ b/drivers/s390/net/qeth_core_main.c @@ -70,15 +70,6 @@ static void qeth_issue_next_read_cb(struct qeth_card *card, static int qeth_qdio_establish(struct qeth_card *); static void qeth_free_qdio_queues(struct qeth_card *card); -static void qeth_close_dev_handler(struct work_struct *work) -{ - struct qeth_card *card; - - card = container_of(work, struct qeth_card, close_dev_work); - QETH_CARD_TEXT(card, 2, "cldevhdl"); - ccwgroup_set_offline(card->gdev); -} - static const char *qeth_get_cardname(struct qeth_card *card) { if (IS_VM_NIC(card)) { @@ -795,10 +786,12 @@ static struct qeth_ipa_cmd *qeth_check_ipa_data(struct qeth_card *card, case IPA_CMD_STOPLAN: if (cmd->hdr.return_code == IPA_RC_VEPA_TO_VEB_TRANSITION) { dev_err(&card->gdev->dev, - "Interface %s is down because the adjacent port is no longer in reflective relay mode\n", + "Adjacent port of interface %s is no longer in reflective relay mode, trigger recovery\n", netdev_name(card->dev)); - schedule_work(&card->close_dev_work); + /* Set offline, then probably fail to set online: */ + qeth_schedule_recovery(card); } else { + /* stay online for subsequent STARTLAN */ dev_warn(&card->gdev->dev, "The link for interface %s on CHPID 0x%X failed\n", netdev_name(card->dev), card->info.chpid); @@ -1540,7 +1533,6 @@ static void qeth_setup_card(struct qeth_card *card) INIT_LIST_HEAD(&card->ipato.entries); qeth_init_qdio_info(card); INIT_DELAYED_WORK(&card->buffer_reclaim_work, qeth_buffer_reclaim_work); - INIT_WORK(&card->close_dev_work, qeth_close_dev_handler); hash_init(card->rx_mode_addrs); hash_init(card->local_addrs4); hash_init(card->local_addrs6); diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c index 72e84ff9fea5..dc6c00768d91 100644 --- a/drivers/s390/net/qeth_l2_main.c +++ b/drivers/s390/net/qeth_l2_main.c @@ -2307,7 +2307,6 @@ static void qeth_l2_remove_device(struct ccwgroup_device *gdev) if (gdev->state == CCWGROUP_ONLINE) qeth_set_offline(card, card->discipline, false); - cancel_work_sync(&card->close_dev_work); if (card->dev->reg_state == NETREG_REGISTERED) { priv = netdev_priv(card->dev); if (priv->brport_features & BR_LEARNING_SYNC) { diff --git a/drivers/s390/net/qeth_l3_main.c b/drivers/s390/net/qeth_l3_main.c index 3a523e700a5a..6fd3e288f059 100644 --- a/drivers/s390/net/qeth_l3_main.c +++ b/drivers/s390/net/qeth_l3_main.c @@ -1969,7 +1969,6 @@ static void qeth_l3_remove_device(struct ccwgroup_device *cgdev) if (cgdev->state == CCWGROUP_ONLINE) qeth_set_offline(card, card->discipline, false); - cancel_work_sync(&card->close_dev_work); if (card->dev->reg_state == NETREG_REGISTERED) unregister_netdev(card->dev); From patchwork Tue Sep 21 14:52:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Julian Wiedmann X-Patchwork-Id: 12508001 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 45C8AC433EF for ; Tue, 21 Sep 2021 14:52:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2E61A6109E for ; Tue, 21 Sep 2021 14:52:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233818AbhIUOyE (ORCPT ); Tue, 21 Sep 2021 10:54:04 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:45236 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233784AbhIUOx7 (ORCPT ); Tue, 21 Sep 2021 10:53:59 -0400 Received: from pps.filterd (m0098409.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 18LE6DjV001549; Tue, 21 Sep 2021 10:52:30 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=eAIpzbHKu/gMjgR0k6JGM4Tc90EDXllQhMtwTVJyCwc=; b=c2AIvgI1A2Rz5pm46RkinLIv/LKkLdZvBMSb8nDZe/PF4tmhBAQm5U6Ph/0Wu6DRvaA3 kZwUJBhPHlo5vT3vq1ob9uL2px9mYf4V4umgQ2MYedfmzY2AKg+Y4UdZMy8HVQEwrPjN FCrSnbwCudhv8CZzFREaJElSIAoLlvhg2rzF8a7zJQNhe9tI08VAottP+uCripT9ijUP WVu5cxrkX4BjOSKS22cLtLJ11SEgTRmLeqMWfkyBqxdh0W/BsS7vShJeAVdLXPY44ANM ArOG3r/tDpskpK/WVkz9foBdXXRYfZIxtWpTTuN73Rxkbv34AHc4jXFjvirVs5EUGNd1 SQ== Received: from ppma04fra.de.ibm.com (6a.4a.5195.ip4.static.sl-reverse.com [149.81.74.106]) by mx0a-001b2d01.pphosted.com with ESMTP id 3b7f15mxfq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 21 Sep 2021 10:52:29 -0400 Received: from pps.filterd (ppma04fra.de.ibm.com [127.0.0.1]) by ppma04fra.de.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 18LEqFTh020123; Tue, 21 Sep 2021 14:52:27 GMT Received: from b06cxnps3074.portsmouth.uk.ibm.com (d06relay09.portsmouth.uk.ibm.com [9.149.109.194]) by ppma04fra.de.ibm.com with ESMTP id 3b57r9nc88-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 21 Sep 2021 14:52:26 +0000 Received: from d06av21.portsmouth.uk.ibm.com (d06av21.portsmouth.uk.ibm.com [9.149.105.232]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 18LEqKDO43712858 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 21 Sep 2021 14:52:20 GMT Received: from d06av21.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id E835C52050; Tue, 21 Sep 2021 14:52:19 +0000 (GMT) Received: from tuxmaker.boeblingen.de.ibm.com (unknown [9.152.85.9]) by d06av21.portsmouth.uk.ibm.com (Postfix) with ESMTP id 98A5952077; Tue, 21 Sep 2021 14:52:19 +0000 (GMT) From: Julian Wiedmann To: David Miller , Jakub Kicinski Cc: linux-netdev , linux-s390 , Heiko Carstens , Karsten Graul , Julian Wiedmann , Alexandra Winter Subject: [PATCH net 3/3] s390/qeth: fix deadlock during failing recovery Date: Tue, 21 Sep 2021 16:52:17 +0200 Message-Id: <20210921145217.1584654-4-jwi@linux.ibm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210921145217.1584654-1-jwi@linux.ibm.com> References: <20210921145217.1584654-1-jwi@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: MOnMHb89gZ9Lci1VkJPKdqaFt4wjj-V- X-Proofpoint-GUID: MOnMHb89gZ9Lci1VkJPKdqaFt4wjj-V- X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.391,FMLib:17.0.607.475 definitions=2021-09-21_04,2021-09-20_01,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 adultscore=0 mlxlogscore=999 mlxscore=0 phishscore=0 impostorscore=0 spamscore=0 priorityscore=1501 suspectscore=0 bulkscore=0 lowpriorityscore=0 clxscore=1015 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109030001 definitions=main-2109210089 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: Alexandra Winter Commit 0b9902c1fcc5 ("s390/qeth: fix deadlock during recovery") removed taking discipline_mutex inside qeth_do_reset(), fixing potential deadlocks. An error path was missed though, that still takes discipline_mutex and thus has the original deadlock potential. Intermittent deadlocks were seen when a qeth channel path is configured offline, causing a race between qeth_do_reset and ccwgroup_remove. Call qeth_set_offline() directly in the qeth_do_reset() error case and then a new variant of ccwgroup_set_offline(), without taking discipline_mutex. Fixes: b41b554c1ee7 ("s390/qeth: fix locking for discipline setup / removal") Signed-off-by: Alexandra Winter Reviewed-by: Julian Wiedmann Signed-off-by: Julian Wiedmann --- arch/s390/include/asm/ccwgroup.h | 2 +- drivers/s390/cio/ccwgroup.c | 10 ++++++++-- drivers/s390/net/qeth_core_main.c | 3 ++- 3 files changed, 11 insertions(+), 4 deletions(-) diff --git a/arch/s390/include/asm/ccwgroup.h b/arch/s390/include/asm/ccwgroup.h index 36dbf5043fc0..aa995d91cd1d 100644 --- a/arch/s390/include/asm/ccwgroup.h +++ b/arch/s390/include/asm/ccwgroup.h @@ -55,7 +55,7 @@ int ccwgroup_create_dev(struct device *root, struct ccwgroup_driver *gdrv, int num_devices, const char *buf); extern int ccwgroup_set_online(struct ccwgroup_device *gdev); -extern int ccwgroup_set_offline(struct ccwgroup_device *gdev); +int ccwgroup_set_offline(struct ccwgroup_device *gdev, bool call_gdrv); extern int ccwgroup_probe_ccwdev(struct ccw_device *cdev); extern void ccwgroup_remove_ccwdev(struct ccw_device *cdev); diff --git a/drivers/s390/cio/ccwgroup.c b/drivers/s390/cio/ccwgroup.c index 2ec741106cb6..f0538609dfe4 100644 --- a/drivers/s390/cio/ccwgroup.c +++ b/drivers/s390/cio/ccwgroup.c @@ -77,12 +77,13 @@ EXPORT_SYMBOL(ccwgroup_set_online); /** * ccwgroup_set_offline() - disable a ccwgroup device * @gdev: target ccwgroup device + * @call_gdrv: Call the registered gdrv set_offline function * * This function attempts to put the ccwgroup device into the offline state. * Returns: * %0 on success and a negative error value on failure. */ -int ccwgroup_set_offline(struct ccwgroup_device *gdev) +int ccwgroup_set_offline(struct ccwgroup_device *gdev, bool call_gdrv) { struct ccwgroup_driver *gdrv = to_ccwgroupdrv(gdev->dev.driver); int ret = -EINVAL; @@ -91,11 +92,16 @@ int ccwgroup_set_offline(struct ccwgroup_device *gdev) return -EAGAIN; if (gdev->state == CCWGROUP_OFFLINE) goto out; + if (!call_gdrv) { + ret = 0; + goto offline; + } if (gdrv->set_offline) ret = gdrv->set_offline(gdev); if (ret) goto out; +offline: gdev->state = CCWGROUP_OFFLINE; out: atomic_set(&gdev->onoff, 0); @@ -124,7 +130,7 @@ static ssize_t ccwgroup_online_store(struct device *dev, if (value == 1) ret = ccwgroup_set_online(gdev); else if (value == 0) - ret = ccwgroup_set_offline(gdev); + ret = ccwgroup_set_offline(gdev, true); else ret = -EINVAL; out: diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c index 9f26706051e5..e9807d2996a9 100644 --- a/drivers/s390/net/qeth_core_main.c +++ b/drivers/s390/net/qeth_core_main.c @@ -5514,7 +5514,8 @@ static int qeth_do_reset(void *data) dev_info(&card->gdev->dev, "Device successfully recovered!\n"); } else { - ccwgroup_set_offline(card->gdev); + qeth_set_offline(card, disc, true); + ccwgroup_set_offline(card->gdev, false); dev_warn(&card->gdev->dev, "The qeth device driver failed to recover an error on the device\n"); }