Message ID | 20241028195243.52488-3-jdamato@fastly.com |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | igc: Link IRQs and queues to NAPIs |
On 10/28/2024 9:52 PM, Joe Damato wrote:
> Link queues to NAPI instances via netdev-genl API so that users can
> query this information with netlink. Handle a few cases in the driver:
>   1. Link/unlink the NAPIs when XDP is enabled/disabled
>   2. Handle IGC_FLAG_QUEUE_PAIRS enabled and disabled

[...]

> -static int igc_resume(struct device *dev)
> +static int __igc_do_resume(struct device *dev, bool need_rtnl)
>  {
>  	struct pci_dev *pdev = to_pci_dev(dev);
>  	struct net_device *netdev = pci_get_drvdata(pdev);
> @@ -7385,7 +7410,11 @@ static int igc_resume(struct device *dev)
>  	wr32(IGC_WUS, ~0);
>
>  	if (netif_running(netdev)) {
> +		if (need_rtnl)
> +			rtnl_lock();
>  		err = __igc_open(netdev, true);
> +		if (need_rtnl)
> +			rtnl_unlock();
>  		if (!err)
>  			netif_device_attach(netdev);
>  	}
> @@ -7393,9 +7422,14 @@ static int igc_resume(struct device *dev)
>  	return err;
>  }
>
> +static int igc_resume(struct device *dev)
> +{
> +	return __igc_do_resume(dev, true);
> +}
> +
>  static int igc_runtime_resume(struct device *dev)
>  {
> -	return igc_resume(dev);
> +	return __igc_do_resume(dev, false);
>  }

[...]

I believe that this fix should work in most cases. I have some concerns
that this solution might not be 100% robust, as runtime resume may
sometimes be triggered without rtnl being held, for example if it is
initiated by a network wake event. But for the moment I think that this
approach is good enough.

My main comment here is about the naming conventions: I prefer using the
original parameter/function names for consistency, similar to what was
done in the igb driver:
https://github.com/torvalds/linux/commit/ac8c58f5b535d6272324e2b8b4a0454781c9147e
On Tue, Oct 29, 2024 at 11:49:03AM +0200, Lifshits, Vitaly wrote:
> On 10/28/2024 9:52 PM, Joe Damato wrote:

[...]

> I believe that this fix should work in most cases. I have some concerns
> that this solution might not be 100% robust, as runtime resume may
> sometimes be triggered without rtnl being held, for example if it is
> initiated by a network wake event. But for the moment I think that this
> approach is good enough.
>
> My main comment here is about the naming conventions: I prefer using the
> original parameter/function names for consistency, similar to what was
> done in the igb driver:
> https://github.com/torvalds/linux/commit/ac8c58f5b535d6272324e2b8b4a0454781c9147e

Sorry, can you be more specific on what the naming issue is?

Do you want me to resubmit this with "__igc_do_resume" renamed to
"__igc_resume" and "bool need_rtnl" renamed to "bool rpm", or something
else?
diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index eac0f966e0e4..b8111ad9a9a8 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -337,6 +337,8 @@ struct igc_adapter {
 	struct igc_led_classdev *leds;
 };
 
+void igc_set_queue_napi(struct igc_adapter *adapter, int q_idx,
+			struct napi_struct *napi);
 void igc_up(struct igc_adapter *adapter);
 void igc_down(struct igc_adapter *adapter);
 int igc_open(struct net_device *netdev);
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 7964bbedb16c..051a0cdb1143 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -4948,6 +4948,22 @@ static int igc_sw_init(struct igc_adapter *adapter)
 	return 0;
 }
 
+void igc_set_queue_napi(struct igc_adapter *adapter, int vector,
+			struct napi_struct *napi)
+{
+	struct igc_q_vector *q_vector = adapter->q_vector[vector];
+
+	if (q_vector->rx.ring)
+		netif_queue_set_napi(adapter->netdev,
+				     q_vector->rx.ring->queue_index,
+				     NETDEV_QUEUE_TYPE_RX, napi);
+
+	if (q_vector->tx.ring)
+		netif_queue_set_napi(adapter->netdev,
+				     q_vector->tx.ring->queue_index,
+				     NETDEV_QUEUE_TYPE_TX, napi);
+}
+
 /**
  * igc_up - Open the interface and prepare it to handle traffic
  * @adapter: board private structure
@@ -4955,6 +4971,7 @@ static int igc_sw_init(struct igc_adapter *adapter)
 void igc_up(struct igc_adapter *adapter)
 {
 	struct igc_hw *hw = &adapter->hw;
+	struct napi_struct *napi;
 	int i = 0;
 
 	/* hardware has been reset, we need to reload some things */
@@ -4962,8 +4979,11 @@ void igc_up(struct igc_adapter *adapter)
 
 	clear_bit(__IGC_DOWN, &adapter->state);
 
-	for (i = 0; i < adapter->num_q_vectors; i++)
-		napi_enable(&adapter->q_vector[i]->napi);
+	for (i = 0; i < adapter->num_q_vectors; i++) {
+		napi = &adapter->q_vector[i]->napi;
+		napi_enable(napi);
+		igc_set_queue_napi(adapter, i, napi);
+	}
 
 	if (adapter->msix_entries)
 		igc_configure_msix(adapter);
@@ -5192,6 +5212,7 @@ void igc_down(struct igc_adapter *adapter)
 	for (i = 0; i < adapter->num_q_vectors; i++) {
 		if (adapter->q_vector[i]) {
 			napi_synchronize(&adapter->q_vector[i]->napi);
+			igc_set_queue_napi(adapter, i, NULL);
 			napi_disable(&adapter->q_vector[i]->napi);
 		}
 	}
@@ -6021,6 +6042,7 @@ static int __igc_open(struct net_device *netdev, bool resuming)
 	struct igc_adapter *adapter = netdev_priv(netdev);
 	struct pci_dev *pdev = adapter->pdev;
 	struct igc_hw *hw = &adapter->hw;
+	struct napi_struct *napi;
 	int err = 0;
 	int i = 0;
 
@@ -6056,8 +6078,11 @@ static int __igc_open(struct net_device *netdev, bool resuming)
 
 	clear_bit(__IGC_DOWN, &adapter->state);
 
-	for (i = 0; i < adapter->num_q_vectors; i++)
-		napi_enable(&adapter->q_vector[i]->napi);
+	for (i = 0; i < adapter->num_q_vectors; i++) {
+		napi = &adapter->q_vector[i]->napi;
+		napi_enable(napi);
+		igc_set_queue_napi(adapter, i, napi);
+	}
 
 	/* Clear any pending interrupts. */
 	rd32(IGC_ICR);
@@ -7342,7 +7367,7 @@ static void igc_deliver_wake_packet(struct net_device *netdev)
 	netif_rx(skb);
 }
 
-static int igc_resume(struct device *dev)
+static int __igc_do_resume(struct device *dev, bool need_rtnl)
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
 	struct net_device *netdev = pci_get_drvdata(pdev);
@@ -7385,7 +7410,11 @@ static int igc_resume(struct device *dev)
 	wr32(IGC_WUS, ~0);
 
 	if (netif_running(netdev)) {
+		if (need_rtnl)
+			rtnl_lock();
 		err = __igc_open(netdev, true);
+		if (need_rtnl)
+			rtnl_unlock();
 		if (!err)
 			netif_device_attach(netdev);
 	}
@@ -7393,9 +7422,14 @@ static int igc_resume(struct device *dev)
 	return err;
 }
 
+static int igc_resume(struct device *dev)
+{
+	return __igc_do_resume(dev, true);
+}
+
 static int igc_runtime_resume(struct device *dev)
 {
-	return igc_resume(dev);
+	return __igc_do_resume(dev, false);
 }
 
 static int igc_suspend(struct device *dev)
@@ -7440,14 +7474,18 @@ static pci_ers_result_t igc_io_error_detected(struct pci_dev *pdev,
 	struct net_device *netdev = pci_get_drvdata(pdev);
 	struct igc_adapter *adapter = netdev_priv(netdev);
 
+	rtnl_lock();
 	netif_device_detach(netdev);
 
-	if (state == pci_channel_io_perm_failure)
+	if (state == pci_channel_io_perm_failure) {
+		rtnl_unlock();
 		return PCI_ERS_RESULT_DISCONNECT;
+	}
 
 	if (netif_running(netdev))
 		igc_down(adapter);
 	pci_disable_device(pdev);
+	rtnl_unlock();
 
 	/* Request a slot reset. */
 	return PCI_ERS_RESULT_NEED_RESET;
diff --git a/drivers/net/ethernet/intel/igc/igc_xdp.c b/drivers/net/ethernet/intel/igc/igc_xdp.c
index e27af72aada8..4da633430b80 100644
--- a/drivers/net/ethernet/intel/igc/igc_xdp.c
+++ b/drivers/net/ethernet/intel/igc/igc_xdp.c
@@ -84,6 +84,7 @@ static int igc_xdp_enable_pool(struct igc_adapter *adapter,
 		napi_disable(napi);
 	}
 
+	igc_set_queue_napi(adapter, queue_id, NULL);
 	set_bit(IGC_RING_FLAG_AF_XDP_ZC, &rx_ring->flags);
 	set_bit(IGC_RING_FLAG_AF_XDP_ZC, &tx_ring->flags);
 
@@ -133,6 +134,7 @@ static int igc_xdp_disable_pool(struct igc_adapter *adapter, u16 queue_id)
 	xsk_pool_dma_unmap(pool, IGC_RX_DMA_ATTR);
 	clear_bit(IGC_RING_FLAG_AF_XDP_ZC, &rx_ring->flags);
 	clear_bit(IGC_RING_FLAG_AF_XDP_ZC, &tx_ring->flags);
+	igc_set_queue_napi(adapter, queue_id, napi);
 
 	if (needs_reset) {
 		napi_enable(napi);
Link queues to NAPI instances via netdev-genl API so that users can
query this information with netlink. Handle a few cases in the driver:
  1. Link/unlink the NAPIs when XDP is enabled/disabled
  2. Handle IGC_FLAG_QUEUE_PAIRS enabled and disabled

Example output when IGC_FLAG_QUEUE_PAIRS is enabled:

$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                         --dump queue-get --json='{"ifindex": 2}'

[{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
 {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'},
 {'id': 2, 'ifindex': 2, 'napi-id': 8195, 'type': 'rx'},
 {'id': 3, 'ifindex': 2, 'napi-id': 8196, 'type': 'rx'},
 {'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'tx'},
 {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'tx'},
 {'id': 2, 'ifindex': 2, 'napi-id': 8195, 'type': 'tx'},
 {'id': 3, 'ifindex': 2, 'napi-id': 8196, 'type': 'tx'}]

Since IGC_FLAG_QUEUE_PAIRS is enabled, you'll note that the same NAPI ID
is present for both rx and tx queues at the same index, for example
index 0:

{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'tx'},

To test IGC_FLAG_QUEUE_PAIRS disabled, a test system was booted using
the grub command line option "maxcpus=2" to force
igc_set_interrupt_capability to disable IGC_FLAG_QUEUE_PAIRS.

Example output when IGC_FLAG_QUEUE_PAIRS is disabled:

$ lscpu | grep "On-line CPU"
On-line CPU(s) list:      0,2

$ ethtool -l enp86s0  | tail -5
Current hardware settings:
RX:		n/a
TX:		n/a
Other:		1
Combined:	2

$ cat /proc/interrupts  | grep enp
 144: [...] enp86s0
 145: [...] enp86s0-rx-0
 146: [...] enp86s0-rx-1
 147: [...] enp86s0-tx-0
 148: [...] enp86s0-tx-1

1 "other" IRQ, and 2 IRQs for each of RX and TX, so we expect netlink to
report 4 IRQs with unique NAPI IDs:

$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                         --dump napi-get --json='{"ifindex": 2}'
[{'id': 8196, 'ifindex': 2, 'irq': 148},
 {'id': 8195, 'ifindex': 2, 'irq': 147},
 {'id': 8194, 'ifindex': 2, 'irq': 146},
 {'id': 8193, 'ifindex': 2, 'irq': 145}]

Now we examine which queues these NAPIs are associated with, expecting
that since IGC_FLAG_QUEUE_PAIRS is disabled each RX and TX queue will
have its own NAPI instance:

$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                         --dump queue-get --json='{"ifindex": 2}'
[{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
 {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'},
 {'id': 0, 'ifindex': 2, 'napi-id': 8195, 'type': 'tx'},
 {'id': 1, 'ifindex': 2, 'napi-id': 8196, 'type': 'tx'}]

Signed-off-by: Joe Damato <jdamato@fastly.com>
---
 v5:
   - Rename igc_resume to __igc_do_resume and pass in a boolean
     "need_rtnl" to signal whether or not rtnl should be held before
     calling __igc_open. Call this new function from igc_runtime_resume
     and igc_resume passing in false (for igc_runtime_resume) and true
     (igc_resume), respectively. This is done to avoid reintroducing a
     bug fixed in commit: 6f31d6b: "igc: Refactor runtime power
     management flow" where rtnl is held in runtime_resume causing a
     deadlock.

 v4:
   - Add rtnl_lock/rtnl_unlock in two paths: igc_resume and
     igc_io_error_detected. The code added to the latter is inspired by
     a similar implementation in ixgbe's ixgbe_io_error_detected.

 v3:
   - Replace igc_unset_queue_napi with igc_set_queue_napi(adapter, i,
     NULL), as suggested by Vinicius Costa Gomes
   - Simplify implementation of igc_set_queue_napi as suggested by Kurt
     Kanzenbach, with a tweak to use ring->queue_index

 v2:
   - Update commit message to include tests for IGC_FLAG_QUEUE_PAIRS
     disabled
   - Refactored code to move napi queue mapping and unmapping to helper
     functions igc_set_queue_napi and igc_unset_queue_napi
   - Adjust the code to handle IGC_FLAG_QUEUE_PAIRS disabled
   - Call helpers to map/unmap queues to NAPIs in igc_up, __igc_open,
     igc_xdp_enable_pool, and igc_xdp_disable_pool

 drivers/net/ethernet/intel/igc/igc.h      |  2 +
 drivers/net/ethernet/intel/igc/igc_main.c | 52 ++++++++++++++++++++---
 drivers/net/ethernet/intel/igc/igc_xdp.c  |  2 +
 3 files changed, 49 insertions(+), 7 deletions(-)
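
For anyone who wants to check the two cases from the commit message mechanically, the queue-get output can be grouped by NAPI ID. The following is only a rough sketch, not part of the patch: it assumes the dump is available as the Python-literal text that cli.py prints above, and the helper names queues_by_napi and queues_are_paired are made up for illustration. The sample data is copied from the commit message.

```python
import ast
from collections import defaultdict

# Sample queue-get dumps copied from the commit message.
PAIRED = """
[{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
 {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'},
 {'id': 2, 'ifindex': 2, 'napi-id': 8195, 'type': 'rx'},
 {'id': 3, 'ifindex': 2, 'napi-id': 8196, 'type': 'rx'},
 {'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'tx'},
 {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'tx'},
 {'id': 2, 'ifindex': 2, 'napi-id': 8195, 'type': 'tx'},
 {'id': 3, 'ifindex': 2, 'napi-id': 8196, 'type': 'tx'}]
"""

UNPAIRED = """
[{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
 {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'},
 {'id': 0, 'ifindex': 2, 'napi-id': 8195, 'type': 'tx'},
 {'id': 1, 'ifindex': 2, 'napi-id': 8196, 'type': 'tx'}]
"""


def queues_by_napi(dump):
    """Map each NAPI ID to the list of (type, queue id) tuples linked to it."""
    groups = defaultdict(list)
    for entry in ast.literal_eval(dump.strip()):
        groups[entry["napi-id"]].append((entry["type"], entry["id"]))
    return dict(groups)


def queues_are_paired(dump):
    """True if every NAPI serves exactly one rx and one tx queue of the same index."""
    for queues in queues_by_napi(dump).values():
        types = sorted(qtype for qtype, _ in queues)
        indexes = {qid for _, qid in queues}
        if types != ["rx", "tx"] or len(indexes) != 1:
            return False
    return True


if __name__ == "__main__":
    print(queues_are_paired(PAIRED))    # True: rx/tx at each index share a NAPI
    print(queues_are_paired(UNPAIRED))  # False: each queue has its own NAPI
```

With IGC_FLAG_QUEUE_PAIRS enabled each NAPI ID should map to one rx and one tx queue at the same index; with it disabled each NAPI ID should map to a single queue.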