[for-next,0/6] Perf and debug fixes for hfi

Message ID: 20210913132317.131370.54825.stgit@awfm-01.cornelisnetworks.com

Dennis Dalessandro Sept. 13, 2021, 1:28 p.m. UTC
Here is a series of perf improvements and debug/trace fixes from Mike,
who has this to say about the patches...

The AIP SDMA interrupt handling is inefficient:

- A slab entry is allocated for each sent packet

  This is despite the fact that there is a ring for each possible send slot
  that could be occupied by a tx descriptor

- The interrupt handling/NAPI is lock happy and has a mixed-up notion of
  producer and consumer

  The ring should be a ring of tx descriptors vs. a ring of pointers

  The consumer of descriptors should be the xmit side of the TX

  The producer of the descriptors is the SDMA interrupt handling and NAPI
  tx completion

  There is certainly no locking required in the interrupt/TX napi tx queue

  There is no locking required in the xmit side since that is held off by NAPI
  code
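
To make the ring-of-descriptors idea concrete, here is a minimal userspace
sketch, not the hfi1 code. All names (demo_tx, demo_ring, RING_SIZE) are
hypothetical, C11 atomics stand in for the kernel's memory-ordering helpers,
and every published slot is assumed to be already completed by the hardware.
The descriptors live inside the ring, so no per-packet allocation is needed,
and each index has exactly one writer, so neither side takes a lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u                 /* hypothetical; must be a power of two */
#define RING_MASK (RING_SIZE - 1u)
static_assert((RING_SIZE & RING_MASK) == 0, "RING_SIZE must be a power of two");

/* Hypothetical descriptor standing in for a driver's tx request. */
struct demo_tx {
	uint32_t len;
};

struct demo_ring {
	struct demo_tx slots[RING_SIZE]; /* descriptors embedded, not pointers */
	_Atomic uint32_t head;           /* written only by the completion side */
	_Atomic uint32_t tail;           /* written only by the xmit side */
};

/* xmit side: return the next free embedded slot, or NULL if the ring is full. */
static struct demo_tx *demo_ring_claim(struct demo_ring *r)
{
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (tail - head == RING_SIZE)
		return NULL;
	/* Mask instead of modulo: no divide on the hot path. */
	return &r->slots[tail & RING_MASK];
}

/* xmit side: make the slot filled after demo_ring_claim() visible. */
static void demo_ring_publish(struct demo_ring *r)
{
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);

	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
}

/* completion side: retire up to budget published slots; returns the count. */
static unsigned int demo_ring_reap(struct demo_ring *r, unsigned int budget)
{
	uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
	unsigned int done = 0;

	while (done < budget && head != tail) {
		head++;
		done++;
	}
	atomic_store_explicit(&r->head, head, memory_order_release);
	return done;
}
```

The indices are free-running 32-bit counters, so "tail - head" gives the
number of slots in flight even after wraparound, and the mask maps a counter
to a slot.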

Note that these patches are also staged publicly on our GitHub site for easy
browsing in context.

https://github.com/cornelisnetworks/linux

---

Mike Marciniszyn (6):
      IB/hfi1: Remove cache and embed txreq in ring
      IB/hfi1: Get rid of hot path divide
      IB/hfi1: Get rid of tx priv backpointer
      IB/hfi1: Tune netdev xmit cachelines
      IB/hfi1: Remove atomic completion count
      IB/hfi1: Add ring consumer and producers traces


 drivers/infiniband/hw/hfi1/ipoib.h    |   76 +++++---
 drivers/infiniband/hw/hfi1/ipoib_tx.c |  314 ++++++++++++++-------------------
 drivers/infiniband/hw/hfi1/trace_tx.h |   71 +++++++
 3 files changed, 246 insertions(+), 215 deletions(-)

--
-Denny

Jason Gunthorpe Sept. 27, 2021, 11:15 p.m. UTC
On Mon, Sep 13, 2021 at 09:28:20AM -0400, Dennis Dalessandro wrote:
> Here is a series of perf improvements and debug/trace fixes from Mike,
> who has this to say about the patches...
> 
> The AIP SDMA interrupt handling is inefficient:
> 
> - A slab entry is allocated for each sent packet
> 
>   This is despite the fact that there is a ring for each possible send slot
>   that could be occupied by a tx descriptor
> 
> - The interrupt handling/NAPI is lock happy and has a mixed-up notion of
>   producer and consumer
> 
>   The ring should be a ring of tx descriptors vs. a ring of pointers
> 
>   The consumer of descriptors should be the xmit side of the TX
> 
>   The producer of the descriptors is the SDMA interrupt handling and NAPI
>   tx completion
> 
>   There is certainly no locking required in the interrupt/TX napi tx queue
> 
>   There is no locking required in the xmit side since that is held off by NAPI
>   code
> 
> Note that these patches are also staged publicly on our GitHub site for easy
> browsing in context.
> 
> https://github.com/cornelisnetworks/linux
> 
> ---
> 
> Mike Marciniszyn (6):
>       IB/hfi1: Remove cache and embed txreq in ring
>       IB/hfi1: Get rid of hot path divide
>       IB/hfi1: Get rid of tx priv backpointer
>       IB/hfi1: Tune netdev xmit cachelines
>       IB/hfi1: Remove atomic completion count
>       IB/hfi1: Add ring consumer and producers traces

Applied to for-next, thanks

Jason