Message ID | 171234737780.5075.5717254021446469741.stgit@anambiarhost.jf.intel.com (mailing list archive) |
---|---|
Headers | show |
Series | Configuring NAPI instance for a queue | expand |
On Fri, 05 Apr 2024 13:09:28 -0700 Amritha Nambiar wrote: > $ ./cli.py --spec netdev.yaml --do queue-set --json='{"ifindex": 12, "id": 0, "type": 0, "napi-id": 595}' > {'id': 0, 'ifindex': 12, 'napi-id': 595, 'type': 'rx'} NAPI ID is not stable. What happens if you set the ID, bring the device down and up again? I think we need to make NAPI IDs stable. What happens if you change the channel count? Do we lose the config? We try never to lose explicit user config. I think for simplicity we should store the config in the core / common code. How does the user know whether queue <> NAPI association is based on driver defaults or explicit configuration? I think I mentioned this in earlier discussions but the configuration may need to be detached from the existing objects (for one thing they may not exist at all when the device is down). Last but not least your driver patch implements the start/stop steps of the "queue API" I think we should pull that out into the core. Also the tests now exist - take a look at the sample one in tools/testing/selftests/drivers/net/stats.py Would be great to have all future netdev family extensions accompanied by tests which can run both on real HW and netdevsim.
On 4/9/2024 4:21 PM, Jakub Kicinski wrote: > On Fri, 05 Apr 2024 13:09:28 -0700 Amritha Nambiar wrote: >> $ ./cli.py --spec netdev.yaml --do queue-set --json='{"ifindex": 12, "id": 0, "type": 0, "napi-id": 595}' >> {'id': 0, 'ifindex': 12, 'napi-id': 595, 'type': 'rx'} > > NAPI ID is not stable. What happens if you set the ID, bring the > device down and up again? I think we need to make NAPI IDs stable. > I tried this (device down/up and check NAPIs) on both bnxt and intel/ice. On bnxt: New NAPI IDs are created sequentially once the device is up after turning down. On ice: The NAPI IDs are stable and remains the same once the device is up after turning down. In case of ice, device down/up executes napi_disable/napi_enable. The NAPI IDs are not lost as netif_napi_del is not called at IFF_DOWN. On IFF_DOWN, the IRQs associations with the OS are freed, but the resources allocated for the vectors and hence the NAPIs for the vectors persists (unless unload/reconfig). > What happens if you change the channel count? Do we lose the config? > We try never to lose explicit user config. I think for simplicity > we should store the config in the core / common code. > Yes, we lose the config in case of re-configuring channels. The reconfig path involves freeing the vectors and reallocating based on the new channel config, so, for the NAPIs associated with the vectors, netif_napi_del and netif_napi_add executes creating new NAPI IDs sequentially. Wouldn't losing the explicit user config make sense in this case? By changing the channel count, the user has updated the queue layout, the queue<>vector mappings etc., so I think, the previous configs from set queue<>NAPI should be overwritten with the new config from set-channel. > How does the user know whether queue <> NAPI association is based > on driver defaults or explicit configuration? I am not sure of this. ethtool shows pre-set defaults and current settings, but in this case, it is tricky :( I think I mentioned > this in earlier discussions but the configuration may need to be > detached from the existing objects (for one thing they may not exist > at all when the device is down). > Yes, we did have that discussion about detaching queues from NAPI. But, I am not sure how to accomplish that. Any thoughts on what other possible object can be used for the configuration? WRT ice, when the device is down, the queues are listed and exists as inactive queues, NAPI IDs exists, IRQs associations with the OS are freed. > Last but not least your driver patch implements the start/stop steps > of the "queue API" I think we should pull that out into the core. > Agree, it would be good to have these steps in the core, but I think the challenge is that we would still end up with a lot of code in the driver as well, due to all the hardware-centric bits in it. > Also the tests now exist - take a look at the sample one in > tools/testing/selftests/drivers/net/stats.py > Would be great to have all future netdev family extensions accompanied > by tests which can run both on real HW and netdevsim. Okay, I will write tests for the new extensions here.
On Thu, 11 Apr 2024 15:46:45 -0700 Nambiar, Amritha wrote: > On 4/9/2024 4:21 PM, Jakub Kicinski wrote: > > On Fri, 05 Apr 2024 13:09:28 -0700 Amritha Nambiar wrote: > >> $ ./cli.py --spec netdev.yaml --do queue-set --json='{"ifindex": 12, "id": 0, "type": 0, "napi-id": 595}' > >> {'id': 0, 'ifindex': 12, 'napi-id': 595, 'type': 'rx'} > > > > NAPI ID is not stable. What happens if you set the ID, bring the > > device down and up again? I think we need to make NAPI IDs stable. > > I tried this (device down/up and check NAPIs) on both bnxt and intel/ice. > On bnxt: New NAPI IDs are created sequentially once the device is up > after turning down. > On ice: The NAPI IDs are stable and remains the same once the device is > up after turning down. > > In case of ice, device down/up executes napi_disable/napi_enable. The > NAPI IDs are not lost as netif_napi_del is not called at IFF_DOWN. On > IFF_DOWN, the IRQs associations with the OS are freed, but the resources > allocated for the vectors and hence the NAPIs for the vectors persists > (unless unload/reconfig). SG! So let's just make sure we cover that in tests. > > What happens if you change the channel count? Do we lose the config? > > We try never to lose explicit user config. I think for simplicity > > we should store the config in the core / common code. > > Yes, we lose the config in case of re-configuring channels. The reconfig > path involves freeing the vectors and reallocating based on the new > channel config, so, for the NAPIs associated with the vectors, > netif_napi_del and netif_napi_add executes creating new NAPI IDs > sequentially. > > Wouldn't losing the explicit user config make sense in this case? By > changing the channel count, the user has updated the queue layout, the > queue<>vector mappings etc., so I think, the previous configs from set > queue<>NAPI should be overwritten with the new config from set-channel. We do prevent indirection table from being reset on channel count change. I think same logic applies here.. > > How does the user know whether queue <> NAPI association is based > > on driver defaults or explicit configuration? > > I am not sure of this. ethtool shows pre-set defaults and current > settings, but in this case, it is tricky :( Can you say more about the use case for moving the queues around? If you just want to have fewer NAPI vectors and more queues, but don't care about exact mapping - we could probably come up with a simpler API, no? Are the queues stack queues or also AF_XDP? > > I think I mentioned > > this in earlier discussions but the configuration may need to be > > detached from the existing objects (for one thing they may not exist > > at all when the device is down). > > Yes, we did have that discussion about detaching queues from NAPI. But, > I am not sure how to accomplish that. Any thoughts on what other > possible object can be used for the configuration? We could stick to the queue as the object perhaps. The "target NAPI" would just be part of the config passed to the alloc/start callbacks. > WRT ice, when the device is down, the queues are listed and exists as > inactive queues, NAPI IDs exists, IRQs associations with the OS are freed. > > > Last but not least your driver patch implements the start/stop steps > > of the "queue API" I think we should pull that out into the core. > > Agree, it would be good to have these steps in the core, but I think the > challenge is that we would still end up with a lot of code in the driver > as well, due to all the hardware-centric bits in it. For one feature I think adding code in the core is not beneficial. But we have multiple adjacent needs, so when we add up your work, zero copy, page pool config, maybe queue alloc.. hopefully the code in the core will be net positive. > > Also the tests now exist - take a look at the sample one in > > tools/testing/selftests/drivers/net/stats.py > > Would be great to have all future netdev family extensions accompanied > > by tests which can run both on real HW and netdevsim. > > Okay, I will write tests for the new extensions here.
On 4/11/2024 6:47 PM, Jakub Kicinski wrote: > On Thu, 11 Apr 2024 15:46:45 -0700 Nambiar, Amritha wrote: >> On 4/9/2024 4:21 PM, Jakub Kicinski wrote: >>> On Fri, 05 Apr 2024 13:09:28 -0700 Amritha Nambiar wrote: >>>> $ ./cli.py --spec netdev.yaml --do queue-set --json='{"ifindex": 12, "id": 0, "type": 0, "napi-id": 595}' >>>> {'id': 0, 'ifindex': 12, 'napi-id': 595, 'type': 'rx'} >>> >>> NAPI ID is not stable. What happens if you set the ID, bring the >>> device down and up again? I think we need to make NAPI IDs stable. >> >> I tried this (device down/up and check NAPIs) on both bnxt and intel/ice. >> On bnxt: New NAPI IDs are created sequentially once the device is up >> after turning down. >> On ice: The NAPI IDs are stable and remains the same once the device is >> up after turning down. >> >> In case of ice, device down/up executes napi_disable/napi_enable. The >> NAPI IDs are not lost as netif_napi_del is not called at IFF_DOWN. On >> IFF_DOWN, the IRQs associations with the OS are freed, but the resources >> allocated for the vectors and hence the NAPIs for the vectors persists >> (unless unload/reconfig). > > SG! So let's just make sure we cover that in tests. > >>> What happens if you change the channel count? Do we lose the config? >>> We try never to lose explicit user config. I think for simplicity >>> we should store the config in the core / common code. >> >> Yes, we lose the config in case of re-configuring channels. The reconfig >> path involves freeing the vectors and reallocating based on the new >> channel config, so, for the NAPIs associated with the vectors, >> netif_napi_del and netif_napi_add executes creating new NAPI IDs >> sequentially. >> >> Wouldn't losing the explicit user config make sense in this case? By >> changing the channel count, the user has updated the queue layout, the >> queue<>vector mappings etc., so I think, the previous configs from set >> queue<>NAPI should be overwritten with the new config from set-channel. > > We do prevent indirection table from being reset on channel count > change. I think same logic applies here.. > Okay. I tried this on bnxt (this may be outside scope and secondary, but hoping all the additional information helps). It looks like bnxt differentiates if the indirection table was based on driver defaults vs user configuration. If the indirection table was from driver defaults, then changing channel count to fewer queues is allowed. If it was based on explicit user configuration, changing channel count to fewer queues is not allowed as the indirection table might then point to inactive queues. So, the rss user configuration is preserved by blocking new channel configurations that do not align. So applying the same logic here would mean, changing the channel count to queues < 'default queue for the last user configured NAPI ID' would have to be prevented. This becomes difficult to track unless pre-set default queue <> NAPI configs are maintained. >>> How does the user know whether queue <> NAPI association is based >>> on driver defaults or explicit configuration? >> >> I am not sure of this. ethtool shows pre-set defaults and current >> settings, but in this case, it is tricky :( > > Can you say more about the use case for moving the queues around? > If you just want to have fewer NAPI vectors and more queues, but > don't care about exact mapping - we could probably come up with > a simpler API, no? Are the queues stack queues or also AF_XDP? > I'll try to explain. The goal is to have fewer NAPI pollers. The number of NAPI pollers is the same as the number of active NAPIs (kthread per NAPI). It is possible to limit the number of pollers by mapping multiples queues on an interrupt vector (fewer vectors, more queues) implicitly in the driver. But, we are looking for a more granular approach, in our case, the queues are grouped into queue-groups/rss-contexts. We would like to reduce the number of pollers within certain selected queue-groups/rss-contexts (not all the queue-groups), hence need the configurability. This would benefit our hyper-threading use case, where a single physical core can be used for both network and application processing. If the NAPI to queue association is known, we can pin the NAPI thread to the logical core and the application thread to the corresponding sibling logical core. The queues are stack queues, not AF_XDP. >>> I think I mentioned >>> this in earlier discussions but the configuration may need to be >>> detached from the existing objects (for one thing they may not exist >>> at all when the device is down). >> >> Yes, we did have that discussion about detaching queues from NAPI. But, >> I am not sure how to accomplish that. Any thoughts on what other >> possible object can be used for the configuration? > > We could stick to the queue as the object perhaps. The "target NAPI" > would just be part of the config passed to the alloc/start callbacks. > Okay. >> WRT ice, when the device is down, the queues are listed and exists as >> inactive queues, NAPI IDs exists, IRQs associations with the OS are freed. >> >>> Last but not least your driver patch implements the start/stop steps >>> of the "queue API" I think we should pull that out into the core. >> >> Agree, it would be good to have these steps in the core, but I think the >> challenge is that we would still end up with a lot of code in the driver >> as well, due to all the hardware-centric bits in it. > > For one feature I think adding code in the core is not beneficial. > But we have multiple adjacent needs, so when we add up your work, > zero copy, page pool config, maybe queue alloc.. hopefully the code > in the core will be net positive. > >>> Also the tests now exist - take a look at the sample one in >>> tools/testing/selftests/drivers/net/stats.py >>> Would be great to have all future netdev family extensions accompanied >>> by tests which can run both on real HW and netdevsim. >> >> Okay, I will write tests for the new extensions here.
On Thu, 18 Apr 2024 14:23:03 -0700 Nambiar, Amritha wrote: > >> I am not sure of this. ethtool shows pre-set defaults and current > >> settings, but in this case, it is tricky :( > > > > Can you say more about the use case for moving the queues around? > > If you just want to have fewer NAPI vectors and more queues, but > > don't care about exact mapping - we could probably come up with > > a simpler API, no? Are the queues stack queues or also AF_XDP? > > I'll try to explain. The goal is to have fewer NAPI pollers. The number > of NAPI pollers is the same as the number of active NAPIs (kthread per > NAPI). It is possible to limit the number of pollers by mapping > multiples queues on an interrupt vector (fewer vectors, more queues) > implicitly in the driver. But, we are looking for a more granular > approach, in our case, the queues are grouped into > queue-groups/rss-contexts. We would like to reduce the number of pollers > within certain selected queue-groups/rss-contexts (not all the > queue-groups), hence need the configurability. > This would benefit our hyper-threading use case, where a single physical > core can be used for both network and application processing. If the > NAPI to queue association is known, we can pin the NAPI thread to the > logical core and the application thread to the corresponding sibling > logical core. Could you provide a more detailed example of a desired configuration? I'm not sure I'm getting the point. What's the point of having multiple queues if they end up in the same NAPI?