From patchwork Thu Aug 3 20:20:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nick Child X-Patchwork-Id: 13340823 X-Patchwork-Delegate: kuba@kernel.org Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 253BE263B2 for ; Thu, 3 Aug 2023 20:20:16 +0000 (UTC) Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 521B0420F for ; Thu, 3 Aug 2023 13:20:15 -0700 (PDT) Received: from pps.filterd (m0353723.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 373KB6bd010167 for ; Thu, 3 Aug 2023 20:20:14 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : mime-version : content-transfer-encoding; s=pp1; bh=PLRASjmceMGK0zRbNtV9kmaVXwhQxJM0rzYSEi5nLa0=; b=fmbW34U0h5BDtxTvHdNakH9AZTyy9lE+xL/PZ0EBE5xR43aLqGDyhoBRQVlxlybonS1Z RXpVoi15fNsNyUTN3AnpDB6TSF4fxcGBFwjwKtinTR6Wk1EafAQMGvsSuahPWaS6gCFE Km1IkISdzePLhWT53e8YC9I81gm33BPIHROOc0Dxjb5YFoheeJhm9KAJuDVeW7fqotBR hiebTOFmZ18HbvEYMqU/4KeCEjVgvBQTvGjNV7Yppk1sQBMEIivdd76LdwBGSQuKQvDJ bepolzNMSdvyjgUP1/CHrzGDqAQpPEIXXjhJamFlgnqMHAtn2Ca2WkK+oQATa0HFG0An LA== Received: from ppma23.wdc07v.mail.ibm.com (5d.69.3da9.ip4.static.sl-reverse.com [169.61.105.93]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3s8k23gmkv-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Thu, 03 Aug 2023 20:20:14 +0000 Received: from pps.filterd (ppma23.wdc07v.mail.ibm.com [127.0.0.1]) by ppma23.wdc07v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 373KFvlC018903 for ; Thu, 3 Aug 2023 20:20:13 GMT Received: from smtprelay03.wdc07v.mail.ibm.com ([172.16.1.70]) by ppma23.wdc07v.mail.ibm.com (PPS) with ESMTPS id 3s5ekm0d25-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Thu, 03 Aug 2023 20:20:13 +0000 Received: from smtpav02.dal12v.mail.ibm.com (smtpav02.dal12v.mail.ibm.com [10.241.53.101]) by smtprelay03.wdc07v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 373KKBZD3277400 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 3 Aug 2023 20:20:11 GMT Received: from smtpav02.dal12v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 8572658051; Thu, 3 Aug 2023 20:20:11 +0000 (GMT) Received: from smtpav02.dal12v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 5E4605805E; Thu, 3 Aug 2023 20:20:11 +0000 (GMT) Received: from li-8d37cfcc-31b9-11b2-a85c-83226d7135c9.austin.ibm.com (unknown [9.24.4.46]) by smtpav02.dal12v.mail.ibm.com (Postfix) with ESMTP; Thu, 3 Aug 2023 20:20:11 +0000 (GMT) From: Nick Child To: netdev@vger.kernel.org Cc: haren@linux.ibm.com, ricklind@us.ibm.com, danymadden@us.ibm.com, tlfalcon@linux.ibm.com, bjking1@linux.ibm.com, Nick Child Subject: [PATCH net 1/5] ibmvnic: Enforce stronger sanity checks on login response Date: Thu, 3 Aug 2023 15:20:06 -0500 Message-Id: <20230803202010.37149-1-nnac123@linux.ibm.com> X-Mailer: git-send-email 2.39.3 Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: Z68cHGX8lwCI1bPEGh25AcV2jydWZbPE X-Proofpoint-ORIG-GUID: Z68cHGX8lwCI1bPEGh25AcV2jydWZbPE X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.957,Hydra:6.0.591,FMLib:17.11.176.26 definitions=2023-08-03_22,2023-08-03_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 suspectscore=0 malwarescore=0 phishscore=0 clxscore=1011 mlxscore=0 lowpriorityscore=0 impostorscore=0 bulkscore=0 adultscore=0 mlxlogscore=999 priorityscore=1501 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2306200000 definitions=main-2308030180 X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: kuba@kernel.org Ensure that all offsets in a login response buffer are within the size of the allocated response buffer. Any offsets or lengths that surpass the allocation are likely the result of an incomplete response buffer. In these cases, a full reset is necessary. When attempting to login, the ibmvnic device will allocate a response buffer and pass a reference to the VIOS. The VIOS will then send the ibmvnic device a LOGIN_RSP CRQ to signal that the buffer has been filled with data. If the ibmvnic device does not get a response in 20 seconds, the old buffer is freed and a new login request is sent. With 2 outstanding requests, any LOGIN_RSP CRQ's could be for the older login request. If this is the case then the login response buffer (which is for the newer login request) could be incomplete and contain invalid data. Therefore, we must enforce strict sanity checks on the response buffer values. Testing has shown that the `off_rxadd_buff_size` value is filled in last by the VIOS and will be the smoking gun for these circumstances. Until VIOS can implement a mechanism for tracking outstanding response buffers and a method for mapping a LOGIN_RSP CRQ to a particular login response buffer, the best ibmvnic can do in this situation is perform a full reset. Fixes: dff515a3e71d ("ibmvnic: Harden device login requests") Signed-off-by: Nick Child Reviewed-by: Simon Horman --- Hello! This patchset is all relevant to recent bugs which came up regarding the ibmvnic login process. Specifically, when this process times out. ibmvnic devices are virtual devices which need to "login" to a physical NIC at the end of its initialization process. This invloves sending a command to the VIOS (virtual input output server, essentially the server that this client is logging into) requesting it to fill out a DMA mapped repsonse buffer. Once done, the VIOS sends a response informing the client that the buffer has been filled with data. If the VIOS does not send a response in 20 seconds then the client tries again. If this happens then several bugs can occur. This is usually due to the fact that there are more than one outstanding requests and no mechanism for mapping a response CRQ to a given response buffer. Until that mechanism is created, this patchset aims to harden this timeout recovery process so that the device does not get stuck in an inopperable state. drivers/net/ethernet/ibm/ibmvnic.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index 763d613adbcc..996f8037c266 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -5397,6 +5397,7 @@ static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq, int num_rx_pools; u64 *size_array; int i; + u32 rsp_len; /* CHECK: Test/set of login_pending does not need to be atomic * because only ibmvnic_tasklet tests/clears this. @@ -5447,6 +5448,23 @@ static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq, ibmvnic_reset(adapter, VNIC_RESET_FATAL); return -EIO; } + + rsp_len = be32_to_cpu(login_rsp->len); + if (be32_to_cpu(login->login_rsp_len) < rsp_len || + rsp_len <= be32_to_cpu(login_rsp->off_txsubm_subcrqs) || + rsp_len <= be32_to_cpu(login_rsp->off_rxadd_subcrqs) || + rsp_len <= be32_to_cpu(login_rsp->off_rxadd_buff_size) || + rsp_len <= be32_to_cpu(login_rsp->off_supp_tx_desc)) { + /* This can happen if a login request times out and there are + * 2 outstanding login requests sent, the LOGIN_RSP crq + * could have been for the older login request. So we are + * parsing the newer response buffer which may be incomplete + */ + dev_err(dev, "FATAL: Login rsp offsets/lengths invalid\n"); + ibmvnic_reset(adapter, VNIC_RESET_FATAL); + return -EIO; + } + size_array = (u64 *)((u8 *)(adapter->login_rsp_buf) + be32_to_cpu(adapter->login_rsp_buf->off_rxadd_buff_size)); /* variable buffer sizes are not supported, so just read the From patchwork Thu Aug 3 20:20:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nick Child X-Patchwork-Id: 13340826 X-Patchwork-Delegate: kuba@kernel.org Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9165C26B07 for ; Thu, 3 Aug 2023 20:20:18 +0000 (UTC) Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 874264229 for ; Thu, 3 Aug 2023 13:20:16 -0700 (PDT) Received: from pps.filterd (m0353723.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 373KB6Gp010192 for ; Thu, 3 Aug 2023 20:20:15 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=7sI/QErqFobe3fCHLMewVX2bEwujINCfGqOQpNeK+84=; b=n6wMicuHydXGCOzlyRWZs0VA5a308OJYwFi7z0sVQINyZYfn07YJS/MB7kVDFSqSZyQp FUIMOWg2dPE196EGJs0ACToLIWBK0nltbCvILdswQzD0RVJJtZodXcJhFpWm/gIL9Jzs 7o5VJJiLGa9GKPUkZPsJuIKyiq13vow94Cc/d0Xiln1eH3UbEULtYewYvK5oUADvA4nX OMPRWablD9IMGI2Hik4c/IBPec8+OCnkvumVmw0HuVlSWkTv7cBXgEZIMRNFBFZaOGMw SFtBInZds5VfiTAwjwdGX2DOtPzj5u3d/qWY+UXBQCLhPuLrt8TolOhyDkskwyv7icgS eQ== Received: from ppma11.dal12v.mail.ibm.com (db.9e.1632.ip4.static.sl-reverse.com [50.22.158.219]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3s8k23gmm7-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Thu, 03 Aug 2023 20:20:15 +0000 Received: from pps.filterd (ppma11.dal12v.mail.ibm.com [127.0.0.1]) by ppma11.dal12v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 373JSI2c014553 for ; Thu, 3 Aug 2023 20:20:14 GMT Received: from smtprelay03.wdc07v.mail.ibm.com ([172.16.1.70]) by ppma11.dal12v.mail.ibm.com (PPS) with ESMTPS id 3s5ft1yvs3-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Thu, 03 Aug 2023 20:20:14 +0000 Received: from smtpav02.dal12v.mail.ibm.com (smtpav02.dal12v.mail.ibm.com [10.241.53.101]) by smtprelay03.wdc07v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 373KKBOS2097810 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 3 Aug 2023 20:20:12 GMT Received: from smtpav02.dal12v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id BC2995805C; Thu, 3 Aug 2023 20:20:11 +0000 (GMT) Received: from smtpav02.dal12v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 8CF475805A; Thu, 3 Aug 2023 20:20:11 +0000 (GMT) Received: from li-8d37cfcc-31b9-11b2-a85c-83226d7135c9.austin.ibm.com (unknown [9.24.4.46]) by smtpav02.dal12v.mail.ibm.com (Postfix) with ESMTP; Thu, 3 Aug 2023 20:20:11 +0000 (GMT) From: Nick Child To: netdev@vger.kernel.org Cc: haren@linux.ibm.com, ricklind@us.ibm.com, danymadden@us.ibm.com, tlfalcon@linux.ibm.com, bjking1@linux.ibm.com, Nick Child Subject: [PATCH net 2/5] ibmvnic: Unmap DMA login rsp buffer on send login fail Date: Thu, 3 Aug 2023 15:20:07 -0500 Message-Id: <20230803202010.37149-2-nnac123@linux.ibm.com> X-Mailer: git-send-email 2.39.3 In-Reply-To: <20230803202010.37149-1-nnac123@linux.ibm.com> References: <20230803202010.37149-1-nnac123@linux.ibm.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: u75qdyhIvXuX4S-Xgq87GjbCccYAW0jn X-Proofpoint-ORIG-GUID: u75qdyhIvXuX4S-Xgq87GjbCccYAW0jn X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.957,Hydra:6.0.591,FMLib:17.11.176.26 definitions=2023-08-03_22,2023-08-03_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 suspectscore=0 malwarescore=0 phishscore=0 clxscore=1015 mlxscore=0 lowpriorityscore=0 impostorscore=0 bulkscore=0 adultscore=0 mlxlogscore=648 priorityscore=1501 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2306200000 definitions=main-2308030180 X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: kuba@kernel.org If the LOGIN CRQ fails to send then we must DMA unmap the response buffer. Previously, if the CRQ failed then the memory was freed without DMA unmapping. Fixes: c98d9cc4170d ("ibmvnic: send_login should check for crq errors") Signed-off-by: Nick Child Reviewed-by: Simon Horman --- drivers/net/ethernet/ibm/ibmvnic.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index 996f8037c266..fdc0bac029ad 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -4830,11 +4830,14 @@ static int send_login(struct ibmvnic_adapter *adapter) if (rc) { adapter->login_pending = false; netdev_err(adapter->netdev, "Failed to send login, rc=%d\n", rc); - goto buf_rsp_map_failed; + goto buf_send_failed; } return 0; +buf_send_failed: + dma_unmap_single(dev, rsp_buffer_token, rsp_buffer_size, + DMA_FROM_DEVICE); buf_rsp_map_failed: kfree(login_rsp_buffer); adapter->login_rsp_buf = NULL; From patchwork Thu Aug 3 20:20:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nick Child X-Patchwork-Id: 13340822 X-Patchwork-Delegate: kuba@kernel.org Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 22B7063BC for ; Thu, 3 Aug 2023 20:20:16 +0000 (UTC) Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 97FE23C31 for ; Thu, 3 Aug 2023 13:20:15 -0700 (PDT) Received: from pps.filterd (m0353727.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 373KBc91022297 for ; Thu, 3 Aug 2023 20:20:15 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=OVB+JD3ZO1Gqmr9MEF7YsVD7CZ7kqxj6z8KS3cWmVMw=; b=E4WKozmUMtLn41+iGc2sVlOw9KAtWDr0xrVCP4WQgIiZho3JnhJeIXMtDmzy+hKnV8mG 7tc4XjeeR9drNuA+umvwnzeB3qH9W4Tkv1wHZtExOj0902eVGkxCHCof6NnCNfGKxvjy /vCK/lcd7B/1JRYsY+Wp27XAA2ExIf65EOjZIbbN9A3Py22lemt9jCswh+2dzYRxS24I HuU9ORuVUuQAJS9NmESLQk5E7KHkAVfY7CYHxY1HaunNa9Wfm+iyr3z0tOOEU/y4c+lk URJSHWKTb2rIm3whHuaPHYyUwJWMFamQzY8F8C5BeajamFCazdv6myIJfb5idqcqTIET +A== Received: from ppma22.wdc07v.mail.ibm.com (5c.69.3da9.ip4.static.sl-reverse.com [169.61.105.92]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3s8ju4ree6-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Thu, 03 Aug 2023 20:20:15 +0000 Received: from pps.filterd (ppma22.wdc07v.mail.ibm.com [127.0.0.1]) by ppma22.wdc07v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 373JVGAq017033 for ; Thu, 3 Aug 2023 20:20:13 GMT Received: from smtprelay02.dal12v.mail.ibm.com ([172.16.1.4]) by ppma22.wdc07v.mail.ibm.com (PPS) with ESMTPS id 3s5dfyrrtq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Thu, 03 Aug 2023 20:20:13 +0000 Received: from smtpav02.dal12v.mail.ibm.com (smtpav02.dal12v.mail.ibm.com [10.241.53.101]) by smtprelay02.dal12v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 373KKCWE27001488 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 3 Aug 2023 20:20:12 GMT Received: from smtpav02.dal12v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id E8CE75805F; Thu, 3 Aug 2023 20:20:11 +0000 (GMT) Received: from smtpav02.dal12v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id C224E5805E; Thu, 3 Aug 2023 20:20:11 +0000 (GMT) Received: from li-8d37cfcc-31b9-11b2-a85c-83226d7135c9.austin.ibm.com (unknown [9.24.4.46]) by smtpav02.dal12v.mail.ibm.com (Postfix) with ESMTP; Thu, 3 Aug 2023 20:20:11 +0000 (GMT) From: Nick Child To: netdev@vger.kernel.org Cc: haren@linux.ibm.com, ricklind@us.ibm.com, danymadden@us.ibm.com, tlfalcon@linux.ibm.com, bjking1@linux.ibm.com, Nick Child Subject: [PATCH net 3/5] ibmvnic: Handle DMA unmapping of login buffs in release functions Date: Thu, 3 Aug 2023 15:20:08 -0500 Message-Id: <20230803202010.37149-3-nnac123@linux.ibm.com> X-Mailer: git-send-email 2.39.3 In-Reply-To: <20230803202010.37149-1-nnac123@linux.ibm.com> References: <20230803202010.37149-1-nnac123@linux.ibm.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: zUsHlEMr5KuzfRDlvn3KxMqjZC7j3X7N X-Proofpoint-GUID: zUsHlEMr5KuzfRDlvn3KxMqjZC7j3X7N X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.957,Hydra:6.0.591,FMLib:17.11.176.26 definitions=2023-08-03_22,2023-08-03_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 bulkscore=0 spamscore=0 impostorscore=0 adultscore=0 mlxlogscore=705 suspectscore=0 mlxscore=0 priorityscore=1501 clxscore=1015 lowpriorityscore=0 phishscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2306200000 definitions=main-2308030180 X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: kuba@kernel.org Rather than leaving the DMA unmapping of the login buffers to the login response handler, move this work into the login release functions. Previously, these functions were only used for freeing the allocated buffers. This could lead to issues if there are more than one outstanding login buffer requests, which is possible if a login request times out. If a login request times out, then there is another call to send login. The send login function makes a call to the login buffer release function. In the past, this freed the buffers but did not DMA unmap. Therefore, the VIOS could still write to the old login (now freed) buffer. It is for this reason that it is a good idea to leave the DMA unmap call to the login buffers release function. Since the login buffer release functions now handle DMA unmapping, remove the duplicate DMA unmapping in handle_login_rsp(). Fixes: dff515a3e71d ("ibmvnic: Harden device login requests") Signed-off-by: Nick Child Reviewed-by: Simon Horman --- drivers/net/ethernet/ibm/ibmvnic.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index fdc0bac029ad..718af76fd711 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -1588,12 +1588,22 @@ static int ibmvnic_login(struct net_device *netdev) static void release_login_buffer(struct ibmvnic_adapter *adapter) { + if (!adapter->login_buf) + return; + + dma_unmap_single(&adapter->vdev->dev, adapter->login_buf_token, + adapter->login_buf_sz, DMA_TO_DEVICE); kfree(adapter->login_buf); adapter->login_buf = NULL; } static void release_login_rsp_buffer(struct ibmvnic_adapter *adapter) { + if (!adapter->login_rsp_buf) + return; + + dma_unmap_single(&adapter->vdev->dev, adapter->login_rsp_buf_token, + adapter->login_rsp_buf_sz, DMA_FROM_DEVICE); kfree(adapter->login_rsp_buf); adapter->login_rsp_buf = NULL; } @@ -5411,11 +5421,6 @@ static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq, } adapter->login_pending = false; - dma_unmap_single(dev, adapter->login_buf_token, adapter->login_buf_sz, - DMA_TO_DEVICE); - dma_unmap_single(dev, adapter->login_rsp_buf_token, - adapter->login_rsp_buf_sz, DMA_FROM_DEVICE); - /* If the number of queues requested can't be allocated by the * server, the login response will return with code 1. We will need * to resend the login buffer with fewer queues requested. From patchwork Thu Aug 3 20:20:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nick Child X-Patchwork-Id: 13340824 X-Patchwork-Delegate: kuba@kernel.org Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 23C2B263B2 for ; Thu, 3 Aug 2023 20:20:17 +0000 (UTC) Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 98A084224 for ; Thu, 3 Aug 2023 13:20:15 -0700 (PDT) Received: from pps.filterd (m0356516.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 373KGZMc030544 for ; Thu, 3 Aug 2023 20:20:14 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=OTfdm3xeOC761Ksp9jMl6jDzPD+XsiyvyzBKqdJkuG8=; b=KcjtApKZ1DGKbaMbf2jzHcqdnLLs0m9c40BGENqhIPjPAsTEgG1Qtlh+SqVf5nKve08O MYE7ikznwGnqqBz9/GDKiSz71tB9c61fNHIdkNvSfmsaZNTAys1C6oK9g/2Wk1vIW5pL QX3bIAavYmd42mFNCuc7G5/EcLKhdG+QddgGpMVKzSESJ8Cvph7XLMX+j+e2IIu9yfVY NLpsl7qoymdFjDXLneDslbrxFAZ2enPDdJOnwSCVLLvk7QidHkqkqoon48jv8bXD9TA2 wy3evKlSdH7/Ktv4ZlNgF9afrxG8sJMPo4L+AxeDb+2/RFOlEhOM8BBkVdBZwUZjo7Pa bQ== Received: from ppma12.dal12v.mail.ibm.com (dc.9e.1632.ip4.static.sl-reverse.com [50.22.158.220]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3s8jw60gb8-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Thu, 03 Aug 2023 20:20:14 +0000 Received: from pps.filterd (ppma12.dal12v.mail.ibm.com [127.0.0.1]) by ppma12.dal12v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 373IiqCE006073 for ; Thu, 3 Aug 2023 20:20:13 GMT Received: from smtprelay02.dal12v.mail.ibm.com ([172.16.1.4]) by ppma12.dal12v.mail.ibm.com (PPS) with ESMTPS id 3s5d3t0yr0-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Thu, 03 Aug 2023 20:20:13 +0000 Received: from smtpav02.dal12v.mail.ibm.com (smtpav02.dal12v.mail.ibm.com [10.241.53.101]) by smtprelay02.dal12v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 373KKClQ43975106 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 3 Aug 2023 20:20:12 GMT Received: from smtpav02.dal12v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 2196358051; Thu, 3 Aug 2023 20:20:12 +0000 (GMT) Received: from smtpav02.dal12v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id F031958060; Thu, 3 Aug 2023 20:20:11 +0000 (GMT) Received: from li-8d37cfcc-31b9-11b2-a85c-83226d7135c9.austin.ibm.com (unknown [9.24.4.46]) by smtpav02.dal12v.mail.ibm.com (Postfix) with ESMTP; Thu, 3 Aug 2023 20:20:11 +0000 (GMT) From: Nick Child To: netdev@vger.kernel.org Cc: haren@linux.ibm.com, ricklind@us.ibm.com, danymadden@us.ibm.com, tlfalcon@linux.ibm.com, bjking1@linux.ibm.com, Nick Child Subject: [PATCH net 4/5] ibmvnic: Do partial reset on login failure Date: Thu, 3 Aug 2023 15:20:09 -0500 Message-Id: <20230803202010.37149-4-nnac123@linux.ibm.com> X-Mailer: git-send-email 2.39.3 In-Reply-To: <20230803202010.37149-1-nnac123@linux.ibm.com> References: <20230803202010.37149-1-nnac123@linux.ibm.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: BCLJL_OL0BNYZzvl-_VU7s0byYLmdE82 X-Proofpoint-ORIG-GUID: BCLJL_OL0BNYZzvl-_VU7s0byYLmdE82 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.957,Hydra:6.0.591,FMLib:17.11.176.26 definitions=2023-08-03_22,2023-08-03_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 malwarescore=0 priorityscore=1501 impostorscore=0 mlxlogscore=919 phishscore=0 mlxscore=0 adultscore=0 clxscore=1015 bulkscore=0 lowpriorityscore=0 suspectscore=0 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2306200000 definitions=main-2308030180 X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: kuba@kernel.org Perform a partial reset before sending a login request if any of the following are true: 1. If a previous request times out. This can be dangerous because the VIOS could still receive the old login request at any point after the timeout. Therefore, it is best to re-register the CRQ's and sub-CRQ's before retrying. 2. If the previous request returns an error that is not described in PAPR. PAPR provides procedures if the login returns with partial success or aborted return codes (section L.5.1) but other values do not have a defined procedure. Previously, these conditions just returned error from the login function rather than trying to resolve the issue. This can cause further issues since most callers of the login function are not prepared to handle an error when logging in. This improper cleanup can lead to the device being permanently DOWN'd. For example, if the VIOS believes that the device is already logged in then it will return INVALID_STATE (-7). If we never re-register CRQ's then it will always think that the device is already logged in. This leaves the device inoperable. The partial reset involves freeing the sub-CRQs, freeing the CRQ then registering and initializing a new CRQ and sub-CRQs. This essentially restarts all communication with VIOS to allow for a fresh login attempt that will be unhindered by any previous failed attempts. Fixes: dff515a3e71d ("ibmvnic: Harden device login requests") Signed-off-by: Nick Child Reviewed-by: Simon Horman --- drivers/net/ethernet/ibm/ibmvnic.c | 46 ++++++++++++++++++++++++++---- 1 file changed, 40 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index 718af76fd711..8fd9639665a0 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -97,6 +97,8 @@ static int pending_scrq(struct ibmvnic_adapter *, static union sub_crq *ibmvnic_next_scrq(struct ibmvnic_adapter *, struct ibmvnic_sub_crq_queue *); static int ibmvnic_poll(struct napi_struct *napi, int data); +static int reset_sub_crq_queues(struct ibmvnic_adapter *adapter); +static inline void reinit_init_done(struct ibmvnic_adapter *adapter); static void send_query_map(struct ibmvnic_adapter *adapter); static int send_request_map(struct ibmvnic_adapter *, dma_addr_t, u32, u8); static int send_request_unmap(struct ibmvnic_adapter *, u8); @@ -1527,11 +1529,9 @@ static int ibmvnic_login(struct net_device *netdev) if (!wait_for_completion_timeout(&adapter->init_done, timeout)) { - netdev_warn(netdev, "Login timed out, retrying...\n"); - retry = true; - adapter->init_done_rc = 0; - retry_count++; - continue; + netdev_warn(netdev, "Login timed out\n"); + adapter->login_pending = false; + goto partial_reset; } if (adapter->init_done_rc == ABORTED) { @@ -1576,7 +1576,41 @@ static int ibmvnic_login(struct net_device *netdev) } else if (adapter->init_done_rc) { netdev_warn(netdev, "Adapter login failed, init_done_rc = %d\n", adapter->init_done_rc); - return -EIO; + +partial_reset: + /* adapter login failed, so free any CRQs or sub-CRQs + * and register again before attempting to login again. + * If we don't do this then the VIOS may think that + * we are already logged in and reject any subsequent + * attempts + */ + netdev_warn(netdev, + "Freeing and re-registering CRQs before attempting to login again\n"); + retry = true; + adapter->init_done_rc = 0; + retry_count++; + release_sub_crqs(adapter, true); + reinit_init_done(adapter); + release_crq_queue(adapter); + /* If we don't sleep here then we risk an unnecessary + * failover event from the VIOS. This is a known VIOS + * issue caused by a vnic device freeing and registering + * a CRQ too quickly. + */ + msleep(1500); + rc = init_crq_queue(adapter); + if (rc) { + netdev_err(netdev, "login recovery: init CRQ failed %d\n", + rc); + return -EIO; + } + + rc = ibmvnic_reset_init(adapter, false); + if (rc) { + netdev_err(netdev, "login recovery: Reset init failed %d\n", + rc); + return -EIO; + } } } while (retry); From patchwork Thu Aug 3 20:20:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nick Child X-Patchwork-Id: 13340825 X-Patchwork-Delegate: kuba@kernel.org Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 55637263D9 for ; Thu, 3 Aug 2023 20:20:18 +0000 (UTC) Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0CF274226 for ; Thu, 3 Aug 2023 13:20:15 -0700 (PDT) Received: from pps.filterd (m0356517.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 373KCshO002919 for ; Thu, 3 Aug 2023 20:20:15 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=FsYaVKA1dPZ/VT/qtPdH8Fml8Ba9b1E03zRHSSSVrSg=; b=XTtOJV3HtoYHwZGreAGW2eyPfzI02vVekZKmy76FPqQTuZN9AX6be8Ckb5ciut7dAvpw UoiZT77g1PClU2SatdYvxwzjbtE0xNsQfscLJc37is+E+IkdzDhHRzGW3ZUjy3DNuHGc 1VxbJ3RG549Sx2YurqRmWBpS25yYJsWILaDrmdJFmoSlldrXjH3KEbDAAbCQLS07DerE mPqdrGBEyoaR+Pa6JXHkRf+hVyn+bNFlS+i7KniH9mjnLnxd/KO5k5oK8Egwkn4IjVeB TzewbK7GWTnWP1e46eRyHvvgQhvSXvADFnDSmli1K5wegI2DNtIGsTeSNf3Bl/CF/3K9 ow== Received: from ppma21.wdc07v.mail.ibm.com (5b.69.3da9.ip4.static.sl-reverse.com [169.61.105.91]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3s8k2tr7ux-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Thu, 03 Aug 2023 20:20:15 +0000 Received: from pps.filterd (ppma21.wdc07v.mail.ibm.com [127.0.0.1]) by ppma21.wdc07v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 373KJ4Om015559 for ; Thu, 3 Aug 2023 20:20:13 GMT Received: from smtprelay02.dal12v.mail.ibm.com ([172.16.1.4]) by ppma21.wdc07v.mail.ibm.com (PPS) with ESMTPS id 3s5e3ngh68-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Thu, 03 Aug 2023 20:20:13 +0000 Received: from smtpav02.dal12v.mail.ibm.com (smtpav02.dal12v.mail.ibm.com [10.241.53.101]) by smtprelay02.dal12v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 373KKCod43975108 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 3 Aug 2023 20:20:12 GMT Received: from smtpav02.dal12v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 4EF925805C; Thu, 3 Aug 2023 20:20:12 +0000 (GMT) Received: from smtpav02.dal12v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 298FF5805A; Thu, 3 Aug 2023 20:20:12 +0000 (GMT) Received: from li-8d37cfcc-31b9-11b2-a85c-83226d7135c9.austin.ibm.com (unknown [9.24.4.46]) by smtpav02.dal12v.mail.ibm.com (Postfix) with ESMTP; Thu, 3 Aug 2023 20:20:12 +0000 (GMT) From: Nick Child To: netdev@vger.kernel.org Cc: haren@linux.ibm.com, ricklind@us.ibm.com, danymadden@us.ibm.com, tlfalcon@linux.ibm.com, bjking1@linux.ibm.com, Nick Child Subject: [PATCH net 5/5] ibmvnic: Ensure login failure recovery is safe from other resets Date: Thu, 3 Aug 2023 15:20:10 -0500 Message-Id: <20230803202010.37149-5-nnac123@linux.ibm.com> X-Mailer: git-send-email 2.39.3 In-Reply-To: <20230803202010.37149-1-nnac123@linux.ibm.com> References: <20230803202010.37149-1-nnac123@linux.ibm.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: Dm9qyCbq3kUh-up_F91OU73BUkD4PFDK X-Proofpoint-GUID: Dm9qyCbq3kUh-up_F91OU73BUkD4PFDK X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.957,Hydra:6.0.591,FMLib:17.11.176.26 definitions=2023-08-03_22,2023-08-03_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 mlxscore=0 mlxlogscore=999 adultscore=0 lowpriorityscore=0 priorityscore=1501 spamscore=0 impostorscore=0 bulkscore=0 phishscore=0 malwarescore=0 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2306200000 definitions=main-2308030180 X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: kuba@kernel.org If a login request fails, the recovery process should be protected against parallel resets. It is a known issue that freeing and registering CRQ's in quick succession can result in a failover CRQ from the VIOS. Processing a failover during login recovery is dangerous for two reasons: 1. This will result in two parallel initialization processes, this can cause serious issues during login. 2. It is possible that the failover CRQ is received but never executed. We get notified of a pending failover through a transport event CRQ. The reset is not performed until a INIT CRQ request is received. Previously, if CRQ init fails during login recovery, then the ibmvnic irq is freed and the login process returned error. If failover_pending is true (a transport event was received), then the ibmvnic device would never be able to process the reset since it cannot receive the CRQ_INIT request due to the irq being freed. This leaved the device in a inoperable state. Therefore, the login failure recovery process must be hardened against these possible issues. Possible failovers (due to quick CRQ free and init) must be avoided and any issues during re-initialization should be dealt with instead of being propagated up the stack. This logic is similar to that of ibmvnic_probe(). Fixes: dff515a3e71d ("ibmvnic: Harden device login requests") Signed-off-by: Nick Child Reviewed-by: Simon Horman --- drivers/net/ethernet/ibm/ibmvnic.c | 67 ++++++++++++++++++++---------- 1 file changed, 46 insertions(+), 21 deletions(-) diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index 8fd9639665a0..77df62511574 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -116,6 +116,7 @@ static void ibmvnic_tx_scrq_clean_buffer(struct ibmvnic_adapter *adapter, static void free_long_term_buff(struct ibmvnic_adapter *adapter, struct ibmvnic_long_term_buff *ltb); static void ibmvnic_disable_irqs(struct ibmvnic_adapter *adapter); +static void flush_reset_queue(struct ibmvnic_adapter *adapter); struct ibmvnic_stat { char name[ETH_GSTRING_LEN]; @@ -1508,7 +1509,7 @@ static const char *adapter_state_to_string(enum vnic_state state) static int ibmvnic_login(struct net_device *netdev) { struct ibmvnic_adapter *adapter = netdev_priv(netdev); - unsigned long timeout = msecs_to_jiffies(20000); + unsigned long flags, timeout = msecs_to_jiffies(20000); int retry_count = 0; int retries = 10; bool retry; @@ -1590,27 +1591,52 @@ static int ibmvnic_login(struct net_device *netdev) adapter->init_done_rc = 0; retry_count++; release_sub_crqs(adapter, true); - reinit_init_done(adapter); - release_crq_queue(adapter); - /* If we don't sleep here then we risk an unnecessary - * failover event from the VIOS. This is a known VIOS - * issue caused by a vnic device freeing and registering - * a CRQ too quickly. + /* Much of this is similar logic as ibmvnic_probe(), + * we are essentially re-initializing communication + * with the server. We really should not run any + * resets/failovers here because this is already a form + * of reset and we do not want parallel resets occurring */ - msleep(1500); - rc = init_crq_queue(adapter); - if (rc) { - netdev_err(netdev, "login recovery: init CRQ failed %d\n", - rc); - return -EIO; - } + do { + reinit_init_done(adapter); + /* Clear any failovers we got in the previous + * pass since we are re-initializing the CRQ + */ + adapter->failover_pending = false; + release_crq_queue(adapter); + /* If we don't sleep here then we risk an + * unnecessary failover event from the VIOS. + * This is a known VIOS issue caused by a vnic + * device freeing and registering a CRQ too + * quickly. + */ + msleep(1500); + /* Avoid any resets, since we are currently + * resetting. + */ + spin_lock_irqsave(&adapter->rwi_lock, flags); + flush_reset_queue(adapter); + spin_unlock_irqrestore(&adapter->rwi_lock, + flags); + + rc = init_crq_queue(adapter); + if (rc) { + netdev_err(netdev, "login recovery: init CRQ failed %d\n", + rc); + return -EIO; + } - rc = ibmvnic_reset_init(adapter, false); - if (rc) { - netdev_err(netdev, "login recovery: Reset init failed %d\n", - rc); - return -EIO; - } + rc = ibmvnic_reset_init(adapter, false); + if (rc) + netdev_err(netdev, "login recovery: Reset init failed %d\n", + rc); + /* IBMVNIC_CRQ_INIT will return EAGAIN if it + * fails, since ibmvnic_reset_init will free + * irq's in failure, we won't be able to receive + * new CRQs so we need to keep trying. probe() + * handles this similarly. + */ + } while (rc == -EAGAIN); } } while (retry); @@ -1903,7 +1929,6 @@ static int ibmvnic_open(struct net_device *netdev) int rc; ASSERT_RTNL(); - /* If device failover is pending or we are about to reset, just set * device state and return. Device operation will be handled by reset * routine.