Message ID | 20241002165954.128085-5-wei.huang2@amd.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | TPH and cache direct injection support | expand |
On Wed, 2 Oct 2024 11:59:53 -0500 Wei Huang wrote: > + if (netif_running(irq->bp->dev)) { > + rtnl_lock(); > + err = netdev_rx_queue_restart(irq->bp->dev, irq->ring_nr); > + if (err) > + netdev_err(irq->bp->dev, > + "rx queue restart failed: err=%d\n", err); > + rtnl_unlock(); > + } > +} > + > +static void __bnxt_irq_affinity_release(struct kref __always_unused *ref) > +{ > +} An empty release function is always a red flag. How is the reference counting used here? Is irq_set_affinity_notifier() not synchronous? Otherwise the rtnl_lock() should probably cover the running check.
[AMD Official Use Only - AMD Internal Distribution Only] Hello Jakub, Thanks for the feedback. We'll update the patch to cover the code under the rtnl_lock. About the empty function, there are no actions to perform when the driver's notify.release function is called. The IRQ notifier is only registered once and there are no older IRQ notifiers for the driver that could get called back. We also followed the precedent seen from other drivers in the kernel tree that follow the same mechanism . See code: From drivers/net/ethernet/intel/i40e/i40e_main.c static void i40e_irq_affinity_release(struct kref *ref) {} From drivers/net/ethernet/intel/iavf/iavf_main.c static void iavf_irq_affinity_release(struct kref *ref) {} From drivers/net/ethernet/fungible/funeth/funeth_main.c static void fun_irq_aff_release(struct kref __always_unused *ref) { } Thanks Manoj -----Original Message----- From: Jakub Kicinski <kuba@kernel.org> Sent: Tuesday, October 8, 2024 6:40 AM To: Huang2, Wei <Wei.Huang2@amd.com> Cc: linux-pci@vger.kernel.org; linux-kernel@vger.kernel.org; linux-doc@vger.kernel.org; netdev@vger.kernel.org; Jonathan.Cameron@Huawei.com; helgaas@kernel.org; corbet@lwn.net; davem@davemloft.net; edumazet@google.com; pabeni@redhat.com; alex.williamson@redhat.com; gospo@broadcom.com; michael.chan@broadcom.com; ajit.khaparde@broadcom.com; somnath.kotur@broadcom.com; andrew.gospodarek@broadcom.com; Panicker, Manoj <Manoj.Panicker2@amd.com>; VanTassell, Eric <Eric.VanTassell@amd.com>; vadim.fedorenko@linux.dev; horms@kernel.org; bagasdotme@gmail.com; bhelgaas@google.com; lukas@wunner.de; paul.e.luse@intel.com; jing2.liu@intel.com Subject: Re: [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. On Wed, 2 Oct 2024 11:59:53 -0500 Wei Huang wrote: > + if (netif_running(irq->bp->dev)) { > + rtnl_lock(); > + err = netdev_rx_queue_restart(irq->bp->dev, irq->ring_nr); > + if (err) > + netdev_err(irq->bp->dev, > + "rx queue restart failed: err=%d\n", err); > + rtnl_unlock(); > + } > +} > + > +static void __bnxt_irq_affinity_release(struct kref __always_unused > +*ref) { } An empty release function is always a red flag. How is the reference counting used here? Is irq_set_affinity_notifier() not synchronous? Otherwise the rtnl_lock() should probably cover the running check.
[These question are for both Jakub and Bjorn] Any suggestions on how to proceed? I can send out a V8 patchset if Jakub is OK with Manoj's solution? Or only a new patch #4 is needed since the rest are intact. Thanks, -Wei On 10/11/24 13:35, Panicker, Manoj wrote: > [AMD Official Use Only - AMD Internal Distribution Only] > > Hello Jakub, > > Thanks for the feedback. We'll update the patch to cover the code under the rtnl_lock. > > About the empty function, there are no actions to perform when the driver's notify.release function is called. The IRQ notifier is only registered once and there are no older IRQ notifiers for the driver that could get called back. We also followed the precedent seen from other drivers in the kernel tree that follow the same mechanism . > > See code: > From drivers/net/ethernet/intel/i40e/i40e_main.c > static void i40e_irq_affinity_release(struct kref *ref) {} > > > From drivers/net/ethernet/intel/iavf/iavf_main.c > static void iavf_irq_affinity_release(struct kref *ref) {} > > > From drivers/net/ethernet/fungible/funeth/funeth_main.c > static void fun_irq_aff_release(struct kref __always_unused *ref) > { > } > > > Thanks > Manoj > > -----Original Message----- > From: Jakub Kicinski <kuba@kernel.org> > Sent: Tuesday, October 8, 2024 6:40 AM > To: Huang2, Wei <Wei.Huang2@amd.com> > Cc: linux-pci@vger.kernel.org; linux-kernel@vger.kernel.org; linux-doc@vger.kernel.org; netdev@vger.kernel.org; Jonathan.Cameron@Huawei.com; helgaas@kernel.org; corbet@lwn.net; davem@davemloft.net; edumazet@google.com; pabeni@redhat.com; alex.williamson@redhat.com; gospo@broadcom.com; michael.chan@broadcom.com; ajit.khaparde@broadcom.com; somnath.kotur@broadcom.com; andrew.gospodarek@broadcom.com; Panicker, Manoj <Manoj.Panicker2@amd.com>; VanTassell, Eric <Eric.VanTassell@amd.com>; vadim.fedorenko@linux.dev; horms@kernel.org; bagasdotme@gmail.com; bhelgaas@google.com; lukas@wunner.de; paul.e.luse@intel.com; jing2.liu@intel.com > Subject: Re: [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver > > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. > > > On Wed, 2 Oct 2024 11:59:53 -0500 Wei Huang wrote: >> + if (netif_running(irq->bp->dev)) { >> + rtnl_lock(); >> + err = netdev_rx_queue_restart(irq->bp->dev, irq->ring_nr); >> + if (err) >> + netdev_err(irq->bp->dev, >> + "rx queue restart failed: err=%d\n", err); >> + rtnl_unlock(); >> + } >> +} >> + >> +static void __bnxt_irq_affinity_release(struct kref __always_unused >> +*ref) { } > > An empty release function is always a red flag. > How is the reference counting used here? > Is irq_set_affinity_notifier() not synchronous? > Otherwise the rtnl_lock() should probably cover the running check.
On Tue, 15 Oct 2024 14:50:39 -0500 Wei Huang wrote: > Any suggestions on how to proceed? I can send out a V8 patchset if Jakub > is OK with Manoj's solution? Or only a new patch #4 is needed since the > rest are intact. 1) y'all need to stop top posting 2) Manoj's reply is AMD internal and I'm not an AMD employee 3) precedent in drivers means relatively little, existing code can be buggy
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 6e422e24750a..23ad2b6e70c7 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -55,6 +55,8 @@ #include <net/page_pool/helpers.h> #include <linux/align.h> #include <net/netdev_queues.h> +#include <net/netdev_rx_queue.h> +#include <linux/pci-tph.h> #include "bnxt_hsi.h" #include "bnxt.h" @@ -10865,6 +10867,61 @@ int bnxt_reserve_rings(struct bnxt *bp, bool irq_re_init) return 0; } +static void __bnxt_irq_affinity_notify(struct irq_affinity_notify *notify, + const cpumask_t *mask) +{ + struct bnxt_irq *irq; + u16 tag; + int err; + + irq = container_of(notify, struct bnxt_irq, affinity_notify); + cpumask_copy(irq->cpu_mask, mask); + + if (pcie_tph_get_cpu_st(irq->bp->pdev, TPH_MEM_TYPE_VM, + cpumask_first(irq->cpu_mask), &tag)) + return; + + if (pcie_tph_set_st_entry(irq->bp->pdev, irq->msix_nr, tag)) + return; + + if (netif_running(irq->bp->dev)) { + rtnl_lock(); + err = netdev_rx_queue_restart(irq->bp->dev, irq->ring_nr); + if (err) + netdev_err(irq->bp->dev, + "rx queue restart failed: err=%d\n", err); + rtnl_unlock(); + } +} + +static void __bnxt_irq_affinity_release(struct kref __always_unused *ref) +{ +} + +static void bnxt_release_irq_notifier(struct bnxt_irq *irq) +{ + irq_set_affinity_notifier(irq->vector, NULL); +} + +static void bnxt_register_irq_notifier(struct bnxt *bp, struct bnxt_irq *irq) +{ + struct irq_affinity_notify *notify; + + irq->bp = bp; + + /* Nothing to do if TPH is not enabled */ + if (!bp->tph_mode) + return; + + /* Register IRQ affinity notifier */ + notify = &irq->affinity_notify; + notify->irq = irq->vector; + notify->notify = __bnxt_irq_affinity_notify; + notify->release = __bnxt_irq_affinity_release; + + irq_set_affinity_notifier(irq->vector, notify); +} + static void bnxt_free_irq(struct bnxt *bp) { struct bnxt_irq *irq; @@ -10887,11 +10944,18 @@ static void bnxt_free_irq(struct bnxt *bp) free_cpumask_var(irq->cpu_mask); irq->have_cpumask = 0; } + + bnxt_release_irq_notifier(irq); + free_irq(irq->vector, bp->bnapi[i]); } irq->requested = 0; } + + /* Disable TPH support */ + pcie_disable_tph(bp->pdev); + bp->tph_mode = 0; } static int bnxt_request_irq(struct bnxt *bp) @@ -10911,6 +10975,12 @@ static int bnxt_request_irq(struct bnxt *bp) #ifdef CONFIG_RFS_ACCEL rmap = bp->dev->rx_cpu_rmap; #endif + + /* Enable TPH support as part of IRQ request */ + rc = pcie_enable_tph(bp->pdev, PCI_TPH_ST_IV_MODE); + if (!rc) + bp->tph_mode = PCI_TPH_ST_IV_MODE; + for (i = 0, j = 0; i < bp->cp_nr_rings; i++) { int map_idx = bnxt_cp_num_to_irq_num(bp, i); struct bnxt_irq *irq = &bp->irq_tbl[map_idx]; @@ -10934,8 +11004,11 @@ static int bnxt_request_irq(struct bnxt *bp) if (zalloc_cpumask_var(&irq->cpu_mask, GFP_KERNEL)) { int numa_node = dev_to_node(&bp->pdev->dev); + u16 tag; irq->have_cpumask = 1; + irq->msix_nr = map_idx; + irq->ring_nr = i; cpumask_set_cpu(cpumask_local_spread(i, numa_node), irq->cpu_mask); rc = irq_set_affinity_hint(irq->vector, irq->cpu_mask); @@ -10945,6 +11018,16 @@ static int bnxt_request_irq(struct bnxt *bp) irq->vector); break; } + + bnxt_register_irq_notifier(bp, irq); + + /* Init ST table entry */ + if (pcie_tph_get_cpu_st(irq->bp->pdev, TPH_MEM_TYPE_VM, + cpumask_first(irq->cpu_mask), + &tag)) + continue; + + pcie_tph_set_st_entry(irq->bp->pdev, irq->msix_nr, tag); } } return rc; diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index 69231e85140b..641d25646367 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -1227,6 +1227,11 @@ struct bnxt_irq { u8 have_cpumask:1; char name[IFNAMSIZ + BNXT_IRQ_NAME_EXTRA]; cpumask_var_t cpu_mask; + + struct bnxt *bp; + int msix_nr; + int ring_nr; + struct irq_affinity_notify affinity_notify; }; #define HWRM_RING_ALLOC_TX 0x1 @@ -2183,6 +2188,8 @@ struct bnxt { struct net_device *dev; struct pci_dev *pdev; + u8 tph_mode; + atomic_t intr_sem; u32 flags; diff --git a/net/core/netdev_rx_queue.c b/net/core/netdev_rx_queue.c index e217a5838c87..10e95d7b6892 100644 --- a/net/core/netdev_rx_queue.c +++ b/net/core/netdev_rx_queue.c @@ -79,3 +79,4 @@ int netdev_rx_queue_restart(struct net_device *dev, unsigned int rxq_idx) return err; } +EXPORT_SYMBOL_GPL(netdev_rx_queue_restart);