[net,1/7] Revert "ibmvnic: simplify reset_long_term_buff function"

Message ID 20210624041316.567622-2-sukadev@linux.ibm.com (mailing list archive)
State Accepted
Commit 0ec13aff058a82426c8d44b688c804cc4a5a0a3d
Delegated to: Netdev Maintainers
Series ibmvnic: Assorted bug fixes

Checks

Context                         Check    Description
netdev/cover_letter             success  Link
netdev/fixes_present            success  Link
netdev/patch_count              success  Link
netdev/tree_selection           success  Clearly marked for net
netdev/subject_prefix           success  Link
netdev/cc_maintainers           fail     2 blamed authors not CCed: ljp@linux.ibm.com davem@davemloft.net; 8 maintainers not CCed: tlfalcon@linux.ibm.com paulus@samba.org benh@kernel.crashing.org linuxppc-dev@lists.ozlabs.org mpe@ellerman.id.au ljp@linux.ibm.com davem@davemloft.net kuba@kernel.org
netdev/source_inline            success  Was 0 now: 0
netdev/verify_signedoff         success  Link
netdev/module_param             success  Was 0 now: 0
netdev/build_32bit              success  Errors and warnings before: 0 this patch: 0
netdev/kdoc                     success  Errors and warnings before: 3 this patch: 3
netdev/verify_fixes             fail     Link
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 78 lines checked
netdev/build_allmodconfig_warn  success  Errors and warnings before: 0 this patch: 0
netdev/header_inline            success  Link

Commit Message

Sukadev Bhattiprolu June 24, 2021, 4:13 a.m. UTC
This reverts commit 1c7d45e7b2c29080bf6c8cd0e213cc3cbb62a054.

We tried to optimize the number of hcalls we send and skipped sending
the REQUEST_MAP calls for some maps. However during resets, we need to
resend all the maps to the VIOS since the VIOS does not remember the
old values. In fact we may have failed over to a new VIOS which will
not have any of the mappings.

When we send packets with map ids the VIOS does not know about, it
triggers a FATAL reset. While the client does recover from the FATAL
error reset, we are seeing a large number of such resets. Handling
FATAL resets is a lot more unnecessary work than issuing a few more
hcalls so revert the commit and resend the maps to the VIOS.

Fixes: 1c7d45e7b2c ("ibmvnic: simplify reset_long_term_buff function")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 46 ++++++++++++++++++++++++------
 1 file changed, 38 insertions(+), 8 deletions(-)
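
The failure mode described in the commit message can be illustrated with a
small user-space model. This is only a sketch under stated assumptions, not
driver code: every toy_* name, the MAX_MAPS limit, and the boolean table
standing in for the VIOS state are hypothetical. It assumes, as the commit
message says, that the server forgets all mappings across a reset or failover
and rejects traffic that refers to a map id it has not been told about.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_MAPS 8	/* hypothetical limit, for the model only */

/* Hypothetical stand-in for the server side: which map ids it knows. */
struct toy_vios {
	bool known[MAX_MAPS];
};

/* Hypothetical stand-in for the client: the map ids its pools use. */
struct toy_client {
	int map_ids[MAX_MAPS];
	int num_maps;
};

/* Models one REQUEST_MAP call: register a single map id with the server. */
static void toy_request_map(struct toy_vios *vios, int map_id)
{
	assert(map_id >= 0 && map_id < MAX_MAPS);
	vios->known[map_id] = true;
}

/* Models a reset or failover: the server comes up with no mappings. */
static void toy_vios_reset(struct toy_vios *vios)
{
	memset(vios, 0, sizeof(*vios));
}

/*
 * Models sending a packet that refers to map_id. A false return stands
 * for the case that triggers a FATAL reset in the commit message.
 */
static bool toy_send_packet(const struct toy_vios *vios, int map_id)
{
	assert(map_id >= 0 && map_id < MAX_MAPS);
	return vios->known[map_id];
}

/* What the revert restores, in spirit: resend every map after a reset. */
static void toy_reset_pools(struct toy_vios *vios,
			    const struct toy_client *client)
{
	for (int i = 0; i < client->num_maps; i++)
		toy_request_map(vios, client->map_ids[i]);
}
```

In this model, calling toy_vios_reset() and then toy_send_packet() without
toy_reset_pools() in between fails for every map id, which is the situation
the reverted optimization could leave the client in.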

Comments

Lijun Pan June 24, 2021, 6:07 a.m. UTC | #1
On Wed, Jun 23, 2021 at 11:16 PM Sukadev Bhattiprolu
<sukadev@linux.ibm.com> wrote:
>
> This reverts commit 1c7d45e7b2c29080bf6c8cd0e213cc3cbb62a054.
>
> We tried to optimize the number of hcalls we send and skipped sending
> the REQUEST_MAP calls for some maps. However during resets, we need to
> resend all the maps to the VIOS since the VIOS does not remember the
> old values. In fact we may have failed over to a new VIOS which will
> not have any of the mappings.
>
> When we send packets with map ids the VIOS does not know about, it
> triggers a FATAL reset. While the client does recover from the FATAL
> error reset, we are seeing a large number of such resets. Handling
> FATAL resets is a lot more unnecessary work than issuing a few more
> hcalls so revert the commit and resend the maps to the VIOS.
>

This was not an issue when the original optimization code was committed.
VIOS changes over time and it is proprietary code, so people don't really know
what it changes every time. If you believe the verbose hcall is really
necessary, you'd better document that in the function/source code. This patch
may be reverted again some time later when the verbose calling isn't needed.

Sukadev Bhattiprolu June 24, 2021, 4:07 p.m. UTC | #2
Lijun Pan [lijunp213@gmail.com] wrote:
> On Wed, Jun 23, 2021 at 11:16 PM Sukadev Bhattiprolu
> <sukadev@linux.ibm.com> wrote:
> >
> > This reverts commit 1c7d45e7b2c29080bf6c8cd0e213cc3cbb62a054.
> >
> > We tried to optimize the number of hcalls we send and skipped sending
> > the REQUEST_MAP calls for some maps. However during resets, we need to
> > resend all the maps to the VIOS since the VIOS does not remember the
> > old values. In fact we may have failed over to a new VIOS which will
> > not have any of the mappings.
> >
> > When we send packets with map ids the VIOS does not know about, it
> > triggers a FATAL reset. While the client does recover from the FATAL
> > error reset, we are seeing a large number of such resets. Handling
> > FATAL resets is a lot more unnecessary work than issuing a few more
> > hcalls so revert the commit and resend the maps to the VIOS.
> >
> 
> This was not an issue when the original optimization code was committed.
> VIOS changes over time and it is proprietary code, so people don't really know
> what it changes every time.

All the more reason to be careful about ripping out code.

> If you believe the verbose hcall is really necessary,
> you'd better document that in the function/source code.

It was necessary and present until you removed it. I am reverting it
after a lot of debugging and with sufficient explanation. Feel free to
submit a new patch.

> This patch may be reverted again
> some time later when the verbose calling isn't needed.

Hopefully not without sufficient testing.

Sukadev

Patch

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index adb0d5ca9ff1..f13ad6bc67cd 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -280,12 +280,40 @@  static void free_long_term_buff(struct ibmvnic_adapter *adapter,
 	dma_free_coherent(dev, ltb->size, ltb->buff, ltb->addr);
 }
 
-static int reset_long_term_buff(struct ibmvnic_long_term_buff *ltb)
+static int reset_long_term_buff(struct ibmvnic_adapter *adapter,
+				struct ibmvnic_long_term_buff *ltb)
 {
-	if (!ltb->buff)
-		return -EINVAL;
+	struct device *dev = &adapter->vdev->dev;
+	int rc;
 
 	memset(ltb->buff, 0, ltb->size);
+
+	mutex_lock(&adapter->fw_lock);
+	adapter->fw_done_rc = 0;
+
+	reinit_completion(&adapter->fw_done);
+	rc = send_request_map(adapter, ltb->addr, ltb->size, ltb->map_id);
+	if (rc) {
+		mutex_unlock(&adapter->fw_lock);
+		return rc;
+	}
+
+	rc = ibmvnic_wait_for_completion(adapter, &adapter->fw_done, 10000);
+	if (rc) {
+		dev_info(dev,
+			 "Reset failed, long term map request timed out or aborted\n");
+		mutex_unlock(&adapter->fw_lock);
+		return rc;
+	}
+
+	if (adapter->fw_done_rc) {
+		dev_info(dev,
+			 "Reset failed, attempting to free and reallocate buffer\n");
+		free_long_term_buff(adapter, ltb);
+		mutex_unlock(&adapter->fw_lock);
+		return alloc_long_term_buff(adapter, ltb, ltb->size);
+	}
+	mutex_unlock(&adapter->fw_lock);
 	return 0;
 }
 
@@ -507,7 +535,8 @@  static int reset_rx_pools(struct ibmvnic_adapter *adapter)
 						  rx_pool->size *
 						  rx_pool->buff_size);
 		} else {
-			rc = reset_long_term_buff(&rx_pool->long_term_buff);
+			rc = reset_long_term_buff(adapter,
+						  &rx_pool->long_term_buff);
 		}
 
 		if (rc)
@@ -630,11 +659,12 @@  static int init_rx_pools(struct net_device *netdev)
 	return 0;
 }
 
-static int reset_one_tx_pool(struct ibmvnic_tx_pool *tx_pool)
+static int reset_one_tx_pool(struct ibmvnic_adapter *adapter,
+			     struct ibmvnic_tx_pool *tx_pool)
 {
 	int rc, i;
 
-	rc = reset_long_term_buff(&tx_pool->long_term_buff);
+	rc = reset_long_term_buff(adapter, &tx_pool->long_term_buff);
 	if (rc)
 		return rc;
 
@@ -661,10 +691,10 @@  static int reset_tx_pools(struct ibmvnic_adapter *adapter)
 
 	tx_scrqs = adapter->num_active_tx_pools;
 	for (i = 0; i < tx_scrqs; i++) {
-		rc = reset_one_tx_pool(&adapter->tso_pool[i]);
+		rc = reset_one_tx_pool(adapter, &adapter->tso_pool[i]);
 		if (rc)
 			return rc;
-		rc = reset_one_tx_pool(&adapter->tx_pool[i]);
+		rc = reset_one_tx_pool(adapter, &adapter->tx_pool[i]);
 		if (rc)
 			return rc;
 	}
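
The order of checks in the restored reset_long_term_buff() can be summarized
with a small user-space sketch. It is an illustration only: struct
toy_map_attempt and toy_reset_ltb() are hypothetical names, the four integer
fields stand in for results the driver gets from send_request_map(),
ibmvnic_wait_for_completion(), adapter->fw_done_rc and alloc_long_term_buff(),
and the locking and buffer zeroing are left out.

```c
#include <stdbool.h>

/* Hypothetical inputs standing in for what the driver observes. */
struct toy_map_attempt {
	int send_rc;	/* result of issuing the map request */
	int wait_rc;	/* result of waiting for the firmware response */
	int fw_done_rc;	/* status carried in the firmware response */
	int realloc_rc;	/* result of the free-and-reallocate fallback */
};

/*
 * Returns 0 on success or the first error encountered, following the
 * same order of checks as the patch. *reallocated is set only when the
 * fallback path (free and reallocate the buffer) is taken.
 */
static int toy_reset_ltb(const struct toy_map_attempt *a, bool *reallocated)
{
	*reallocated = false;

	/* The request could not be sent at all. */
	if (a->send_rc)
		return a->send_rc;

	/* No response arrived in time, or the wait was aborted. */
	if (a->wait_rc)
		return a->wait_rc;

	/* The response reported a failure: fall back to reallocation. */
	if (a->fw_done_rc) {
		*reallocated = true;
		return a->realloc_rc;
	}

	return 0;
}
```

Note that only the third case attempts recovery; a send failure or a timeout
is returned to the caller unchanged.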