mbox series

[net-next,0/9] ibmvnic: Reuse ltb, rx, tx pools

Message ID 20210901000812.120968-1-sukadev@linux.ibm.com (mailing list archive)
Headers show
Series ibmvnic: Reuse ltb, rx, tx pools | expand

Message

Sukadev Bhattiprolu Sept. 1, 2021, 12:08 a.m. UTC
It can take a long time to free and reallocate rx and tx pools and long
term buffer (LTB) during each reset of the VNIC. This is specially true
when the partition (LPAR) is heavily loaded and going through a Logical
Partition Migration (LPM). The long drawn reset causes the LPAR to lose
connectivity for extended periods of time and results in "RMC connection"
errors and the LPM failing.

What is worse is that during the LPM we could get a failover because
of the lost connectivity. At that point, the vnic driver releases
even the resources it has already allocated and starts over.

As long as the resources we have already allocated are valid/applicable,
we might as well hold on to them while trying to allocate the remaining
resources. This patch set attempts to reuse the resources previously
allocated as long as they are valid. It seems to vastly improve the
time taken for the vnic reset. We have also not seen any RMC connection
issues during our testing with this patch set.

If the backing devices for a vnic adapter are not "matched" (see "pool
parameters" in patches 8 and 9) it is possible that we will still free
all the resources and allocate them. If that becomes a common problem,
we have to address it separately.

Thanks to input and extensive testing from Brian King, Cris Forno,
Dany Madden, Rick Lindsley.

Sukadev Bhattiprolu (9):
  ibmvnic: consolidate related code in replenish_rx_pool()
  ibmvnic: Fix up some comments and messages
  ibmvnic: Use/rename local vars in init_rx_pools
  ibmvnic: Use/rename local vars in init_tx_pools
  ibmvnic: init_tx_pools move loop-invariant code out
  ibmvnic: use bitmap for LTB map_ids
  ibmvnic: Reuse LTB when possible
  ibmvnic: Reuse rx pools when possible
  ibmvnic: Reuse tx pools when possible

 drivers/net/ethernet/ibm/ibmvnic.c | 592 ++++++++++++++++++-----------
 drivers/net/ethernet/ibm/ibmvnic.h |  10 +-
 2 files changed, 379 insertions(+), 223 deletions(-)

Comments

Rick Lindsley Sept. 1, 2021, 1:21 a.m. UTC | #1
On 8/31/21 5:08 PM, Sukadev Bhattiprolu wrote:
> It can take a long time to free and reallocate rx and tx pools and long
> term buffer (LTB) during each reset of the VNIC. This is specially true
> when the partition (LPAR) is heavily loaded and going through a Logical
> Partition Migration (LPM). The long drawn reset causes the LPAR to lose
> connectivity for extended periods of time and results in "RMC connection"
> errors and the LPM failing.
> 
> What is worse is that during the LPM we could get a failover because
> of the lost connectivity. At that point, the vnic driver releases
> even the resources it has already allocated and starts over.
> 
> As long as the resources we have already allocated are valid/applicable,
> we might as well hold on to them while trying to allocate the remaining
> resources. This patch set attempts to reuse the resources previously
> allocated as long as they are valid. It seems to vastly improve the
> time taken for the vnic reset. We have also not seen any RMC connection
> issues during our testing with this patch set.
> 
> If the backing devices for a vnic adapter are not "matched" (see "pool
> parameters" in patches 8 and 9) it is possible that we will still free
> all the resources and allocate them. If that becomes a common problem,
> we have to address it separately.

We have spent a good amount of time going over this. Thans for all the work.

Reviewed-by:  Rick Lindsley <ricklind@linux.vnet.ibm.com>
Jakub Kicinski Sept. 1, 2021, 2:35 a.m. UTC | #2
On Tue, 31 Aug 2021 17:08:03 -0700 Sukadev Bhattiprolu wrote:
> It can take a long time to free and reallocate rx and tx pools and long
> term buffer (LTB) during each reset of the VNIC. This is specially true
> when the partition (LPAR) is heavily loaded and going through a Logical
> Partition Migration (LPM). The long drawn reset causes the LPAR to lose
> connectivity for extended periods of time and results in "RMC connection"
> errors and the LPM failing.
> 
> What is worse is that during the LPM we could get a failover because
> of the lost connectivity. At that point, the vnic driver releases
> even the resources it has already allocated and starts over.
> 
> As long as the resources we have already allocated are valid/applicable,
> we might as well hold on to them while trying to allocate the remaining
> resources. This patch set attempts to reuse the resources previously
> allocated as long as they are valid. It seems to vastly improve the
> time taken for the vnic reset. We have also not seen any RMC connection
> issues during our testing with this patch set.
> 
> If the backing devices for a vnic adapter are not "matched" (see "pool
> parameters" in patches 8 and 9) it is possible that we will still free
> all the resources and allocate them. If that becomes a common problem,
> we have to address it separately.
> 
> Thanks to input and extensive testing from Brian King, Cris Forno,
> Dany Madden, Rick Lindsley.

Please fix the kdoc issues in this submission. Kdoc script is your
friend:

./scripts/kernel-doc -none <files>
Sukadev Bhattiprolu Sept. 1, 2021, 6:07 p.m. UTC | #3
Jakub Kicinski [kuba@kernel.org] wrote:
> 
> Please fix the kdoc issues in this submission. Kdoc script is your
> friend:
> 
> ./scripts/kernel-doc -none <files>

Yes, Thanks! I fixed them in v2

Sukadev