Message ID | 20240409215431.41424-6-michael.chan@broadcom.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 2e4592dc9bee5ca4fd3da4c5451c7f97c76442bf |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | bnxt_en: Updates for net-next | expand |
On 4/9/2024 2:54 PM, Michael Chan wrote: > From: Vikas Gupta <vikas.gupta@broadcom.com> > > The existing scheme sets aside a number of MSIX/NQs for the RoCE > driver whether the RoCE driver is registered or not. This scheme > is not flexible and limits the resources available for the L2 rings > if RoCE is never used. > > Modify the scheme so that the RoCE MSIX/NQs can be used by the L2 > driver if they are not used for RoCE. The MSIX/NQs are now > represented by 3 fields. bp->ulp_num_msix_want contains the > desired default value, edev->ulp_num_msix_vec contains the > available value (but not necessarily in use), and > ulp_tbl->msix_requested contains the actual value in use by RoCE. > > The L2 driver can dip into edev->ulp_num_msix_vec if necessary. > > We need to add rtnl_lock() back in bnxt_register_dev() and > bnxt_unregister_dev() to synchronize the MSIX usage between L2 and > RoCE. > > Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> > Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> > Signed-off-by: Michael Chan <michael.chan@broadcom.com> > --- Whats the behavior if the L2 driver dips into this pool and then RoCE is enabled later? I guess RoCE would fail to get the resources it needs, but then system administrator could re-configure the L2 device to use fewer resources? Makes sense. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Thanks, Jake
On Tue, Apr 9, 2024 at 4:40 PM Jacob Keller <jacob.e.keller@intel.com> wrote: > > > > On 4/9/2024 2:54 PM, Michael Chan wrote: > > From: Vikas Gupta <vikas.gupta@broadcom.com> > > > > The existing scheme sets aside a number of MSIX/NQs for the RoCE > > driver whether the RoCE driver is registered or not. This scheme > > is not flexible and limits the resources available for the L2 rings > > if RoCE is never used. > > > > Modify the scheme so that the RoCE MSIX/NQs can be used by the L2 > > driver if they are not used for RoCE. The MSIX/NQs are now > > represented by 3 fields. bp->ulp_num_msix_want contains the > > desired default value, edev->ulp_num_msix_vec contains the > > available value (but not necessarily in use), and > > ulp_tbl->msix_requested contains the actual value in use by RoCE. > > > > The L2 driver can dip into edev->ulp_num_msix_vec if necessary. > > > > We need to add rtnl_lock() back in bnxt_register_dev() and > > bnxt_unregister_dev() to synchronize the MSIX usage between L2 and > > RoCE. > > > > Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> > > Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> > > Signed-off-by: Michael Chan <michael.chan@broadcom.com> > > --- > > Whats the behavior if the L2 driver dips into this pool and then RoCE is > enabled later? Thanks for the review. If the user increases the L2 rings or enables XDP which will cause the driver to allocate a new set of XDP TX rings, it can now use the RoCE MSIX if needed and if they are not in use. > > I guess RoCE would fail to get the resources it needs, but then system > administrator could re-configure the L2 device to use fewer resources? If the above simply reduces the RoCE MSIX, the RoCE driver can still operate with fewer MSIX. If the above has reduced the MSIX below the minimum required for RoCE, then RoCE will fail to initialize. At that point, the user can reduce the L2 rings and reload the RoCE driver.
On 4/9/2024 4:48 PM, Michael Chan wrote: > On Tue, Apr 9, 2024 at 4:40 PM Jacob Keller <jacob.e.keller@intel.com> wrote: >> >> >> >> On 4/9/2024 2:54 PM, Michael Chan wrote: >>> From: Vikas Gupta <vikas.gupta@broadcom.com> >>> >>> The existing scheme sets aside a number of MSIX/NQs for the RoCE >>> driver whether the RoCE driver is registered or not. This scheme >>> is not flexible and limits the resources available for the L2 rings >>> if RoCE is never used. >>> >>> Modify the scheme so that the RoCE MSIX/NQs can be used by the L2 >>> driver if they are not used for RoCE. The MSIX/NQs are now >>> represented by 3 fields. bp->ulp_num_msix_want contains the >>> desired default value, edev->ulp_num_msix_vec contains the >>> available value (but not necessarily in use), and >>> ulp_tbl->msix_requested contains the actual value in use by RoCE. >>> >>> The L2 driver can dip into edev->ulp_num_msix_vec if necessary. >>> >>> We need to add rtnl_lock() back in bnxt_register_dev() and >>> bnxt_unregister_dev() to synchronize the MSIX usage between L2 and >>> RoCE. >>> >>> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> >>> Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> >>> Signed-off-by: Michael Chan <michael.chan@broadcom.com> >>> --- >> >> Whats the behavior if the L2 driver dips into this pool and then RoCE is >> enabled later? > > Thanks for the review. If the user increases the L2 rings or enables > XDP which will cause the driver to allocate a new set of XDP TX rings, > it can now use the RoCE MSIX if needed and if they are not in use. > >> >> I guess RoCE would fail to get the resources it needs, but then system >> administrator could re-configure the L2 device to use fewer resources? > > If the above simply reduces the RoCE MSIX, the RoCE driver can still > operate with fewer MSIX. If the above has reduced the MSIX below the > minimum required for RoCE, then RoCE will fail to initialize. At that > point, the user can reduce the L2 rings and reload the RoCE driver. Sensible behavior. Thanks for the detailed explanation -Jake
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 8c24e83ba050..88cf8f47e071 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -7549,6 +7549,20 @@ static int __bnxt_reserve_rings(struct bnxt *bp) if (!netif_is_rxfh_configured(bp->dev)) bnxt_set_dflt_rss_indir_tbl(bp, NULL); + if (!bnxt_ulp_registered(bp->edev) && BNXT_NEW_RM(bp)) { + int resv_msix, resv_ctx, ulp_msix, ulp_ctxs; + struct bnxt_hw_resc *hw_resc; + + hw_resc = &bp->hw_resc; + resv_msix = hw_resc->resv_irqs - bp->cp_nr_rings; + ulp_msix = bnxt_get_ulp_msix_num(bp); + ulp_msix = min_t(int, resv_msix, ulp_msix); + bnxt_set_ulp_msix_num(bp, ulp_msix); + resv_ctx = hw_resc->resv_stat_ctxs - bp->cp_nr_rings; + ulp_ctxs = min(resv_ctx, bnxt_get_ulp_stat_ctxs(bp)); + bnxt_set_ulp_stat_ctxs(bp, ulp_ctxs); + } + return rc; } @@ -14982,6 +14996,7 @@ static void bnxt_trim_dflt_sh_rings(struct bnxt *bp) static int bnxt_set_dflt_rings(struct bnxt *bp, bool sh) { int dflt_rings, max_rx_rings, max_tx_rings, rc; + int avail_msix; if (!bnxt_can_reserve_rings(bp)) return 0; @@ -15009,6 +15024,14 @@ static int bnxt_set_dflt_rings(struct bnxt *bp, bool sh) bp->cp_nr_rings = bp->tx_nr_rings_per_tc + bp->rx_nr_rings; bp->tx_nr_rings = bp->tx_nr_rings_per_tc; + avail_msix = bnxt_get_max_func_irqs(bp) - bp->cp_nr_rings; + if (avail_msix >= BNXT_MIN_ROCE_CP_RINGS) { + int ulp_num_msix = min(avail_msix, bp->ulp_num_msix_want); + + bnxt_set_ulp_msix_num(bp, ulp_num_msix); + bnxt_set_dflt_ulp_stat_ctxs(bp); + } + rc = __bnxt_reserve_rings(bp); if (rc && rc != -ENODEV) netdev_warn(bp->dev, "Unable to reserve tx rings\n"); @@ -15358,6 +15381,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) bnxt_set_rx_skb_mode(bp, false); bnxt_set_tpa_flags(bp); bnxt_set_ring_params(bp); + bnxt_rdma_aux_device_init(bp); rc = bnxt_set_dflt_rings(bp, true); if (rc) { if (BNXT_VF(bp) && rc == -ENODEV) { @@ -15411,7 +15435,6 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) if (BNXT_SUPPORTS_NTUPLE_VNIC(bp)) bnxt_init_multi_rss_ctx(bp); - bnxt_rdma_aux_device_init(bp); rc = register_netdev(dev); if (rc) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index 0640fcb57ef8..ad57ef051798 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -2303,6 +2303,7 @@ struct bnxt { struct bnxt_irq *irq_tbl; int total_irqs; + int ulp_num_msix_want; u8 mac_addr[ETH_ALEN]; #ifdef CONFIG_BNXT_DCB diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c index ae2b367b1226..de2cb1d4cd98 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c @@ -48,6 +48,46 @@ static void bnxt_fill_msix_vecs(struct bnxt *bp, struct bnxt_msix_entry *ent) } } +int bnxt_get_ulp_msix_num(struct bnxt *bp) +{ + if (bp->edev) + return bp->edev->ulp_num_msix_vec; + return 0; +} + +void bnxt_set_ulp_msix_num(struct bnxt *bp, int num) +{ + if (bp->edev) + bp->edev->ulp_num_msix_vec = num; +} + +int bnxt_get_ulp_stat_ctxs(struct bnxt *bp) +{ + if (bp->edev) + return bp->edev->ulp_num_ctxs; + return 0; +} + +void bnxt_set_ulp_stat_ctxs(struct bnxt *bp, int num_ulp_ctx) +{ + if (bp->edev) + bp->edev->ulp_num_ctxs = num_ulp_ctx; +} + +void bnxt_set_dflt_ulp_stat_ctxs(struct bnxt *bp) +{ + if (bp->edev) { + bp->edev->ulp_num_ctxs = BNXT_MIN_ROCE_STAT_CTXS; + /* Reserve one additional stat_ctx for PF0 (except + * on 1-port NICs) as it also creates one stat_ctx + * for PF1 in case of RoCE bonding. + */ + if (BNXT_PF(bp) && !bp->pf.port_id && + bp->port_count > 1) + bp->edev->ulp_num_ctxs++; + } +} + int bnxt_register_dev(struct bnxt_en_dev *edev, struct bnxt_ulp_ops *ulp_ops, void *handle) @@ -56,11 +96,19 @@ int bnxt_register_dev(struct bnxt_en_dev *edev, struct bnxt *bp = netdev_priv(dev); unsigned int max_stat_ctxs; struct bnxt_ulp *ulp; + int rc = 0; + rtnl_lock(); + if (!bp->irq_tbl) { + rc = -ENODEV; + goto exit; + } max_stat_ctxs = bnxt_get_max_func_stat_ctxs(bp); if (max_stat_ctxs <= BNXT_MIN_ROCE_STAT_CTXS || - bp->cp_nr_rings == max_stat_ctxs) - return -ENOMEM; + bp->cp_nr_rings == max_stat_ctxs) { + rc = -ENOMEM; + goto exit; + } ulp = edev->ulp_tbl; ulp->handle = handle; @@ -69,9 +117,13 @@ int bnxt_register_dev(struct bnxt_en_dev *edev, if (test_bit(BNXT_STATE_OPEN, &bp->state)) bnxt_hwrm_vnic_cfg(bp, &bp->vnic_info[BNXT_VNIC_DEFAULT]); + edev->ulp_tbl->msix_requested = bnxt_get_ulp_msix_num(bp); + bnxt_fill_msix_vecs(bp, bp->edev->msix_entries); edev->flags |= BNXT_EN_FLAG_MSIX_REQUESTED; - return 0; +exit: + rtnl_unlock(); + return rc; } EXPORT_SYMBOL(bnxt_register_dev); @@ -83,8 +135,10 @@ void bnxt_unregister_dev(struct bnxt_en_dev *edev) int i = 0; ulp = edev->ulp_tbl; + rtnl_lock(); if (ulp->msix_requested) edev->flags &= ~BNXT_EN_FLAG_MSIX_REQUESTED; + edev->ulp_tbl->msix_requested = 0; if (ulp->max_async_event_id) bnxt_hwrm_func_drv_rgtr(bp, NULL, 0, true); @@ -97,11 +151,12 @@ void bnxt_unregister_dev(struct bnxt_en_dev *edev) msleep(100); i++; } + rtnl_unlock(); return; } EXPORT_SYMBOL(bnxt_unregister_dev); -int bnxt_get_ulp_msix_num(struct bnxt *bp) +static int bnxt_set_dflt_ulp_msix(struct bnxt *bp) { u32 roce_msix = BNXT_VF(bp) ? BNXT_MAX_VF_ROCE_MSIX : BNXT_MAX_ROCE_MSIX; @@ -110,18 +165,6 @@ int bnxt_get_ulp_msix_num(struct bnxt *bp) min_t(u32, roce_msix, num_online_cpus()) : 0); } -int bnxt_get_ulp_stat_ctxs(struct bnxt *bp) -{ - if (bnxt_ulp_registered(bp->edev)) { - struct bnxt_en_dev *edev = bp->edev; - - if (edev->ulp_tbl->msix_requested) - return BNXT_MIN_ROCE_STAT_CTXS; - } - - return 0; -} - int bnxt_send_msg(struct bnxt_en_dev *edev, struct bnxt_fw_msg *fw_msg) { @@ -336,7 +379,6 @@ static void bnxt_set_edev_info(struct bnxt_en_dev *edev, struct bnxt *bp) edev->pf_port_id = bp->pf.port_id; edev->en_state = bp->state; edev->bar0 = bp->bar0; - edev->ulp_tbl->msix_requested = bnxt_get_ulp_msix_num(bp); } void bnxt_rdma_aux_device_add(struct bnxt *bp) @@ -409,6 +451,7 @@ void bnxt_rdma_aux_device_init(struct bnxt *bp) aux_priv->edev = edev; bp->edev = edev; bnxt_set_edev_info(edev, bp); + bp->ulp_num_msix_want = bnxt_set_dflt_ulp_msix(bp); return; diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h index ae7266ddf167..04ce3328e66f 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h @@ -85,17 +85,23 @@ struct bnxt_en_dev { * updated in resume. */ void __iomem *bar0; + + u16 ulp_num_msix_vec; + u16 ulp_num_ctxs; }; static inline bool bnxt_ulp_registered(struct bnxt_en_dev *edev) { - if (edev && edev->ulp_tbl) + if (edev && rcu_access_pointer(edev->ulp_tbl->ulp_ops)) return true; return false; } int bnxt_get_ulp_msix_num(struct bnxt *bp); +void bnxt_set_ulp_msix_num(struct bnxt *bp, int num); int bnxt_get_ulp_stat_ctxs(struct bnxt *bp); +void bnxt_set_ulp_stat_ctxs(struct bnxt *bp, int num_ctxs); +void bnxt_set_dflt_ulp_stat_ctxs(struct bnxt *bp); void bnxt_ulp_stop(struct bnxt *bp); void bnxt_ulp_start(struct bnxt *bp, int err); void bnxt_ulp_sriov_cfg(struct bnxt *bp, int num_vfs);