[net-next,v2,4/8] net: napi: add CPU affinity to napi->config

Message ID 20241218165843.744647-5-ahmed.zaki@intel.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Series net: napi: add CPU affinity to napi->config

Checks

Context                         Check    Description
netdev/series_format            success  Posting correctly formatted
netdev/tree_selection           success  Clearly marked for net-next, async
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              fail     Errors and warnings before: 53 this patch: 49
netdev/build_tools              success  Errors and warnings before: 0 (+23) this patch: 0 (+23)
netdev/cc_maintainers           warning  1 maintainers not CCed: horms@kernel.org
netdev/build_clang              fail     Errors and warnings before: 5875 this patch: 6597
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  fail     Errors and warnings before: 3902 this patch: 4079
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 135 lines checked
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 95 this patch: 95
netdev/source_inline            success  Was 0 now: 0

Commit Message

Ahmed Zaki Dec. 18, 2024, 4:58 p.m. UTC
A common task for most drivers is to remember the user-set CPU affinity
of their IRQs. On each netdev reset, the driver should re-apply the
user's setting to the IRQs.

Add CPU affinity mask to napi->config. To delegate the CPU affinity
management to the core, drivers must:
 1 - add a persistent napi config:     netif_napi_add_config()
 2 - bind an IRQ to the napi instance: netif_napi_set_irq() with the new
     flag NAPIF_IRQ_AFFINITY

The core will then make sure to re-assign the affinity to the napi's
IRQ.

The default mask for all IRQs is all online CPUs.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
---
 include/linux/netdevice.h |  5 +++
 net/core/dev.c            | 66 +++++++++++++++++++++++++++++++++++++--
 2 files changed, 69 insertions(+), 2 deletions(-)
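
The save/restore behaviour described in the commit message can be pictured with a small userspace model. This is only a sketch under stated assumptions: a 64-bit word stands in for cpumask_t, and every demo_* name is hypothetical and not part of the kernel API. It shows the user's mask being remembered in a persistent config and re-applied after a reset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: a 64-bit word stands in for cpumask_t. */
typedef uint64_t demo_cpumask_t;

/* Hypothetical stand-in for the persistent per-queue config. */
struct demo_napi_config {
	demo_cpumask_t affinity_mask;	/* survives resets */
};

/* Hypothetical stand-in for a NAPI instance bound to an IRQ. */
struct demo_napi {
	int irq;			/* <= 0 means no IRQ bound */
	demo_cpumask_t irq_affinity;	/* mask the IRQ currently uses */
	struct demo_napi_config *config;
};

/* Default mask: all online CPUs, as the commit message states. */
void demo_config_init(struct demo_napi_config *cfg, demo_cpumask_t online)
{
	assert(cfg != NULL);
	cfg->affinity_mask = online;
}

/* The user changed the IRQ affinity: remember it in the config. */
void demo_affinity_notify(struct demo_napi *n, demo_cpumask_t mask)
{
	n->irq_affinity = mask;
	if (n->config)
		n->config->affinity_mask = mask;
}

/* A reset: the IRQ is re-created and falls back to the default mask. */
void demo_reset(struct demo_napi *n, int new_irq, demo_cpumask_t online)
{
	n->irq = new_irq;
	n->irq_affinity = online;
}

/* Re-apply the remembered mask, modelled on the napi_restore_config() hunk. */
void demo_restore(struct demo_napi *n)
{
	if (n->irq > 0 && n->config)
		n->irq_affinity = n->config->affinity_mask;
}
```

In this model a driver would call demo_restore() after every demo_reset(); in the patch the equivalent step lives in the core, so converted drivers do not repeat it.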

Comments

kernel test robot Dec. 18, 2024, 8:16 p.m. UTC | #1
Hi Ahmed,

kernel test robot noticed the following build warnings:

[auto build test WARNING on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Ahmed-Zaki/net-napi-add-irq_flags-to-napi-struct/20241219-010125
base:   net-next/main
patch link:    https://lore.kernel.org/r/20241218165843.744647-5-ahmed.zaki%40intel.com
patch subject: [Intel-wired-lan] [PATCH net-next v2 4/8] net: napi: add CPU affinity to napi->config
config: arm-randconfig-001-20241219 (https://download.01.org/0day-ci/archive/20241219/202412190421.N2xtn20H-lkp@intel.com/config)
compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241219/202412190421.N2xtn20H-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202412190421.N2xtn20H-lkp@intel.com/

All warnings (new ones prefixed by >>):

   net/core/dev.c:6716:6: warning: unused variable 'rc' [-Wunused-variable]
    6716 |         int rc;
         |             ^~
   net/core/dev.c:6746:7: warning: unused variable 'rc' [-Wunused-variable]
    6746 |         int  rc;
         |              ^~
>> net/core/dev.c:6766:7: warning: variable 'glue_created' is uninitialized when used here [-Wuninitialized]
    6766 |         if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
         |              ^~~~~~~~~~~~
   net/core/dev.c:6745:19: note: initialize the variable 'glue_created' to silence this warning
    6745 |         bool glue_created;
         |                          ^
         |                           = 0
   3 warnings generated.


vim +/glue_created +6766 net/core/dev.c

  6765	
> 6766		if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
  6767			glue = kzalloc(sizeof(*glue), GFP_KERNEL);
  6768			if (!glue)
  6769				return;
  6770			glue->notify.notify = netif_irq_cpu_rmap_notify;
  6771			glue->notify.release = netif_napi_affinity_release;
  6772			glue->data = napi;
  6773			glue->rmap = NULL;
  6774			napi->irq_flags |= NAPIF_IRQ_NORMAP;
  6775		}
  6776	}
  6777	EXPORT_SYMBOL(netif_napi_set_irq);
  6778
kernel test robot Dec. 18, 2024, 8:27 p.m. UTC | #2
Hi Ahmed,

kernel test robot noticed the following build warnings:

[auto build test WARNING on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Ahmed-Zaki/net-napi-add-irq_flags-to-napi-struct/20241219-010125
base:   net-next/main
patch link:    https://lore.kernel.org/r/20241218165843.744647-5-ahmed.zaki%40intel.com
patch subject: [Intel-wired-lan] [PATCH net-next v2 4/8] net: napi: add CPU affinity to napi->config
config: riscv-randconfig-001-20241219 (https://download.01.org/0day-ci/archive/20241219/202412190454.nwvp3hU2-lkp@intel.com/config)
compiler: clang version 16.0.6 (https://github.com/llvm/llvm-project 7cbf1a2591520c2491aa35339f227775f4d3adf6)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241219/202412190454.nwvp3hU2-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202412190454.nwvp3hU2-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> net/core/dev.c:6755:7: warning: variable 'glue_created' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
                   if (rc) {
                       ^~
   net/core/dev.c:6766:7: note: uninitialized use occurs here
           if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
                ^~~~~~~~~~~~
   net/core/dev.c:6755:3: note: remove the 'if' if its condition is always false
                   if (rc) {
                   ^~~~~~~~~
>> net/core/dev.c:6752:6: warning: variable 'glue_created' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
           if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   net/core/dev.c:6766:7: note: uninitialized use occurs here
           if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
                ^~~~~~~~~~~~
   net/core/dev.c:6752:2: note: remove the 'if' if its condition is always true
           if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> net/core/dev.c:6752:6: warning: variable 'glue_created' is used uninitialized whenever '&&' condition is false [-Wsometimes-uninitialized]
           if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
               ^~~~~~~~~~~~~~~~~~~~~~
   net/core/dev.c:6766:7: note: uninitialized use occurs here
           if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
                ^~~~~~~~~~~~
   net/core/dev.c:6752:6: note: remove the '&&' if its condition is always true
           if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
               ^~~~~~~~~~~~~~~~~~~~~~~~~
   net/core/dev.c:6745:19: note: initialize the variable 'glue_created' to silence this warning
           bool glue_created;
                            ^
                             = 0
   net/core/dev.c:4176:1: warning: unused function 'sch_handle_ingress' [-Wunused-function]
   sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret,
   ^
   net/core/dev.c:4183:1: warning: unused function 'sch_handle_egress' [-Wunused-function]
   sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev)
   ^
   net/core/dev.c:5440:19: warning: unused function 'nf_ingress' [-Wunused-function]
   static inline int nf_ingress(struct sk_buff *skb, struct packet_type **pt_prev,
                     ^
   6 warnings generated.

Kconfig warnings: (for reference only)
   WARNING: unmet direct dependencies detected for FB_IOMEM_HELPERS
   Depends on [n]: HAS_IOMEM [=y] && FB_CORE [=n]
   Selected by [m]:
   - DRM_XE_DISPLAY [=y] && HAS_IOMEM [=y] && DRM [=m] && DRM_XE [=m] && DRM_XE [=m]=m [=m] && HAS_IOPORT [=y]


vim +6755 net/core/dev.c

8e5191fb19bffce Ahmed Zaki 2024-12-18  6741  
001dc6db21f4bfe Ahmed Zaki 2024-12-18  6742  void netif_napi_set_irq(struct napi_struct *napi, int irq, unsigned long flags)
001dc6db21f4bfe Ahmed Zaki 2024-12-18  6743  {
8e5191fb19bffce Ahmed Zaki 2024-12-18  6744  	struct irq_glue *glue = NULL;
8e5191fb19bffce Ahmed Zaki 2024-12-18  6745  	bool glue_created;
a274d2669a73ef7 Ahmed Zaki 2024-12-18  6746  	int  rc;
a274d2669a73ef7 Ahmed Zaki 2024-12-18  6747  
001dc6db21f4bfe Ahmed Zaki 2024-12-18  6748  	napi->irq = irq;
001dc6db21f4bfe Ahmed Zaki 2024-12-18  6749  	napi->irq_flags = flags;
a274d2669a73ef7 Ahmed Zaki 2024-12-18  6750  
a274d2669a73ef7 Ahmed Zaki 2024-12-18  6751  #ifdef CONFIG_RFS_ACCEL
a274d2669a73ef7 Ahmed Zaki 2024-12-18 @6752  	if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
8e5191fb19bffce Ahmed Zaki 2024-12-18  6753  		rc = irq_cpu_rmap_add(napi->dev->rx_cpu_rmap, irq, napi,
8e5191fb19bffce Ahmed Zaki 2024-12-18  6754  				      netif_irq_cpu_rmap_notify);
a274d2669a73ef7 Ahmed Zaki 2024-12-18 @6755  		if (rc) {
a274d2669a73ef7 Ahmed Zaki 2024-12-18  6756  			netdev_warn(napi->dev, "Unable to update ARFS map (%d).\n",
a274d2669a73ef7 Ahmed Zaki 2024-12-18  6757  				    rc);
a274d2669a73ef7 Ahmed Zaki 2024-12-18  6758  			free_irq_cpu_rmap(napi->dev->rx_cpu_rmap);
a274d2669a73ef7 Ahmed Zaki 2024-12-18  6759  			napi->dev->rx_cpu_rmap = NULL;
8e5191fb19bffce Ahmed Zaki 2024-12-18  6760  		} else {
8e5191fb19bffce Ahmed Zaki 2024-12-18  6761  			glue_created = true;
a274d2669a73ef7 Ahmed Zaki 2024-12-18  6762  		}
a274d2669a73ef7 Ahmed Zaki 2024-12-18  6763  	}
a274d2669a73ef7 Ahmed Zaki 2024-12-18  6764  #endif
8e5191fb19bffce Ahmed Zaki 2024-12-18  6765  
8e5191fb19bffce Ahmed Zaki 2024-12-18  6766  	if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
8e5191fb19bffce Ahmed Zaki 2024-12-18  6767  		glue = kzalloc(sizeof(*glue), GFP_KERNEL);
8e5191fb19bffce Ahmed Zaki 2024-12-18  6768  		if (!glue)
8e5191fb19bffce Ahmed Zaki 2024-12-18  6769  			return;
8e5191fb19bffce Ahmed Zaki 2024-12-18  6770  		glue->notify.notify = netif_irq_cpu_rmap_notify;
8e5191fb19bffce Ahmed Zaki 2024-12-18  6771  		glue->notify.release = netif_napi_affinity_release;
8e5191fb19bffce Ahmed Zaki 2024-12-18  6772  		glue->data = napi;
8e5191fb19bffce Ahmed Zaki 2024-12-18  6773  		glue->rmap = NULL;
8e5191fb19bffce Ahmed Zaki 2024-12-18  6774  		napi->irq_flags |= NAPIF_IRQ_NORMAP;
8e5191fb19bffce Ahmed Zaki 2024-12-18  6775  	}
001dc6db21f4bfe Ahmed Zaki 2024-12-18  6776  }
001dc6db21f4bfe Ahmed Zaki 2024-12-18  6777  EXPORT_SYMBOL(netif_napi_set_irq);
001dc6db21f4bfe Ahmed Zaki 2024-12-18  6778
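
Both robot reports point at the same problem: glue_created is only assigned on the success path inside the CONFIG_RFS_ACCEL block, so it is read uninitialized when the rmap condition is false, when irq_cpu_rmap_add() fails, or when the block is compiled out. A minimal userspace sketch of the control flow with the variable initialized follows. The demo_* names and flag values are hypothetical, and this is not the fix the author posted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits, standing in for the NAPIF_IRQ_* values. */
#define DEMO_IRQ_ARFS_RMAP	(1UL << 0)
#define DEMO_IRQ_AFFINITY	(1UL << 1)

/*
 * Returns true when the affinity-only glue would have to be allocated.
 * have_rmap models napi->dev->rx_cpu_rmap being non-NULL, and
 * rmap_add_rc models the return value of irq_cpu_rmap_add().
 */
bool demo_needs_affinity_glue(bool have_rmap, int rmap_add_rc,
			      unsigned long flags)
{
	bool glue_created = false;	/* defined on every path */

	assert(rmap_add_rc <= 0);	/* 0 or a negative errno */

	if (have_rmap && (flags & DEMO_IRQ_ARFS_RMAP)) {
		if (rmap_add_rc == 0)
			glue_created = true;
	}

	return !glue_created && (flags & DEMO_IRQ_AFFINITY);
}
```

With the initializer in place, each of the three paths flagged by clang reads a well-defined false.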
Jakub Kicinski Dec. 20, 2024, 3:42 a.m. UTC | #3
On Wed, 18 Dec 2024 09:58:39 -0700 Ahmed Zaki wrote:
> +	if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
> +		glue = kzalloc(sizeof(*glue), GFP_KERNEL);
> +		if (!glue)
> +			return;
> +		glue->notify.notify = netif_irq_cpu_rmap_notify;
> +		glue->notify.release = netif_napi_affinity_release;
> +		glue->data = napi;
> +		glue->rmap = NULL;
> +		napi->irq_flags |= NAPIF_IRQ_NORMAP;

Why allocate the glue? is it not possible to add the fields:

	struct irq_affinity_notify notify;
	u16 index;

to struct napi_struct ?
Ahmed Zaki Dec. 20, 2024, 2:51 p.m. UTC | #4
On 2024-12-19 8:42 p.m., Jakub Kicinski wrote:
> On Wed, 18 Dec 2024 09:58:39 -0700 Ahmed Zaki wrote:
>> +	if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
>> +		glue = kzalloc(sizeof(*glue), GFP_KERNEL);
>> +		if (!glue)
>> +			return;
>> +		glue->notify.notify = netif_irq_cpu_rmap_notify;
>> +		glue->notify.release = netif_napi_affinity_release;
>> +		glue->data = napi;
>> +		glue->rmap = NULL;
>> +		napi->irq_flags |= NAPIF_IRQ_NORMAP;
> 
> Why allocate the glue? is it not possible to add the fields:
> 
> 	struct irq_affinity_notify notify;
> 	u16 index;
> 
> to struct napi_struct ?

In the first branch of "if", the cb function netif_irq_cpu_rmap_notify() 
is also passed to irq_cpu_rmap_add() where the irq notifier is embedded 
in "struct irq_glue".

I think this cannot be changed as long as some drivers are directly 
calling irq_cpu_rmap_add() instead of the proposed API.
Jakub Kicinski Dec. 20, 2024, 5:23 p.m. UTC | #5
On Fri, 20 Dec 2024 07:51:09 -0700 Ahmed Zaki wrote:
> On 2024-12-19 8:42 p.m., Jakub Kicinski wrote:
> > On Wed, 18 Dec 2024 09:58:39 -0700 Ahmed Zaki wrote:  
> >> +	if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
> >> +		glue = kzalloc(sizeof(*glue), GFP_KERNEL);
> >> +		if (!glue)
> >> +			return;
> >> +		glue->notify.notify = netif_irq_cpu_rmap_notify;
> >> +		glue->notify.release = netif_napi_affinity_release;
> >> +		glue->data = napi;
> >> +		glue->rmap = NULL;
> >> +		napi->irq_flags |= NAPIF_IRQ_NORMAP;  
> > 
> > Why allocate the glue? is it not possible to add the fields:
> > 
> > 	struct irq_affinity_notify notify;
> > 	u16 index;
> > 
> > to struct napi_struct ?  
> 
> In the first branch of "if", the cb function netif_irq_cpu_rmap_notify() 
> is also passed to irq_cpu_rmap_add() where the irq notifier is embedded 
> in "struct irq_glue".

I don't understand what you're trying to say, could you rephrase?

> I think this cannot be changed as long as some drivers are directly 
> calling irq_cpu_rmap_add() instead of the proposed API.

Drivers which are not converted shouldn't matter if we have our own
notifier and call cpu_rmap_update() directly, no?

Drivers which are converted should not call irq_cpu_rmap_add().
Ahmed Zaki Dec. 20, 2024, 7:15 p.m. UTC | #6
On 2024-12-20 10:23 a.m., Jakub Kicinski wrote:
> On Fri, 20 Dec 2024 07:51:09 -0700 Ahmed Zaki wrote:
>> On 2024-12-19 8:42 p.m., Jakub Kicinski wrote:
>>> On Wed, 18 Dec 2024 09:58:39 -0700 Ahmed Zaki wrote:
>>>> +	if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
>>>> +		glue = kzalloc(sizeof(*glue), GFP_KERNEL);
>>>> +		if (!glue)
>>>> +			return;
>>>> +		glue->notify.notify = netif_irq_cpu_rmap_notify;
>>>> +		glue->notify.release = netif_napi_affinity_release;
>>>> +		glue->data = napi;
>>>> +		glue->rmap = NULL;
>>>> +		napi->irq_flags |= NAPIF_IRQ_NORMAP;
>>>
>>> Why allocate the glue? is it not possible to add the fields:
>>>
>>> 	struct irq_affinity_notify notify;
>>> 	u16 index;
>>>
>>> to struct napi_struct ?
>>
>> In the first branch of "if", the cb function netif_irq_cpu_rmap_notify()
>> is also passed to irq_cpu_rmap_add() where the irq notifier is embedded
>> in "struct irq_glue".
> 
> I don't understand what you're trying to say, could you rephrase?

Sure. After this patch, we have (simplified):

void netif_napi_set_irq(struct napi_struct *napi, int irq, unsigned long flags)
{
	struct irq_glue *glue = NULL;
	int rc;

	napi->irq = irq;

#ifdef CONFIG_RFS_ACCEL
	if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
		rc = irq_cpu_rmap_add(napi->dev->rx_cpu_rmap, irq, napi,
				      netif_irq_cpu_rmap_notify);
		.
		.
		.
	}
#endif

	if (flags & NAPIF_IRQ_AFFINITY) {
		glue = kzalloc(sizeof(*glue), GFP_KERNEL);
		if (!glue)
			return;
		glue->notify.notify = netif_irq_cpu_rmap_notify;
		glue->notify.release = netif_napi_affinity_release;
		.
		.
	}
}


Both branches install the new cb function "netif_irq_cpu_rmap_notify()"
as the IRQ notifier, but the first branch calls irq_cpu_rmap_add(),
where the notifier is embedded in "struct irq_glue". The cb function
therefore has to assume the notifier is inside an irq_glue, and the
second "if" branch has to do the same.
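
The constraint can be illustrated with a userspace sketch of the container_of() pattern. All gdemo_* names are hypothetical and the macro is a simplified stand-in for the kernel's container_of(). Because the single callback recovers its context by stepping back from the notifier to an enclosing glue structure, every notifier handed to that callback must be embedded in such a structure.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define gdemo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct irq_affinity_notify. */
struct gdemo_notify {
	void (*notify)(struct gdemo_notify *n, unsigned long mask);
};

/* Hypothetical stand-in for struct napi_struct. */
struct gdemo_napi {
	unsigned long saved_mask;
};

/* Hypothetical stand-in for struct irq_glue: the notifier lives inside it. */
struct gdemo_glue {
	struct gdemo_notify notify;
	void *data;			/* points at the napi */
};

/* One callback for both branches; it always assumes an enclosing glue. */
void gdemo_notify_cb(struct gdemo_notify *n, unsigned long mask)
{
	struct gdemo_glue *glue =
		gdemo_container_of(n, struct gdemo_glue, notify);
	struct gdemo_napi *napi = glue->data;

	assert(napi != NULL);
	napi->saved_mask = mask;
}
```

If a notifier that is not embedded in a gdemo_glue were passed to gdemo_notify_cb(), the pointer arithmetic would produce a bogus glue pointer, which is why the affinity-only branch in the patch allocates a glue of its own.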


> 
>> I think this cannot be changed as long as some drivers are directly
>> calling irq_cpu_rmap_add() instead of the proposed API.
> 
> Drivers which are not converted shouldn't matter if we have our own
> notifier and call cpu_rmap_update() directly, no?
> 

The only dependency is that irq_cpu_rmap_add() puts the notifier inside irq_glue.

> Drivers which are converted should not call irq_cpu_rmap_add().

Correct, they don't.
Jakub Kicinski Dec. 20, 2024, 7:37 p.m. UTC | #7
On Fri, 20 Dec 2024 12:15:33 -0700 Ahmed Zaki wrote:
> > I don't understand what you're trying to say, could you rephrase?  
> 
> Sure. After this patch, we have (simplified):
> 
> void netif_napi_set_irq(struct napi_struct *napi, int irq, unsigned long 
> flags)
>   {
> 	struct irq_glue *glue = NULL;
>   	int  rc;
> 
>   	napi->irq = irq;
> 
>   #ifdef CONFIG_RFS_ACCEL
>   	if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
> 		rc = irq_cpu_rmap_add(napi->dev->rx_cpu_rmap, irq, napi,
> 				      netif_irq_cpu_rmap_notify);
> 		.
> 		.
> 		.
>   	}
>   #endif
> 
> 	if (flags & NAPIF_IRQ_AFFINITY) {
> 		glue = kzalloc(sizeof(*glue), GFP_KERNEL);
> 		if (!glue)
> 			return;
> 		glue->notify.notify = netif_irq_cpu_rmap_notify;
> 		glue->notify.release = netif_napi_affinity_release;
> 		.
> 		.
> 	}
>   }
> 
> 
> Both branches assign the new cb function "netif_irq_cpu_rmap_notify()" 
> as the new IRQ notifier, but the first branch calls irq_cpu_rmap_add() 
> where the notifier is embedded in "struct irq_glue". So the cb function 
> needs to assume the notifier is inside irq_glue, so the second "if" 
> branch needs to do the same.

First off, I'm still a bit confused why you think the flags should be
per NAPI call and not set at init time, once.
Perhaps rename netif_enable_cpu_rmap() suggested earlier to something
more generic (netif_enable_irq_tracking()?) and pass the flags there?
Or is there a driver which wants to vary the flags per NAPI instance?

Then you can probably register a single unified handler, and inside
that handler check if the device wanted to have rmap or just affinity?
Ahmed Zaki Dec. 20, 2024, 8:14 p.m. UTC | #8
On 2024-12-20 12:37 p.m., Jakub Kicinski wrote:
> On Fri, 20 Dec 2024 12:15:33 -0700 Ahmed Zaki wrote:
>>> I don't understand what you're trying to say, could you rephrase?
>>
>> Sure. After this patch, we have (simplified):
>>
>> void netif_napi_set_irq(struct napi_struct *napi, int irq, unsigned long
>> flags)
>>    {
>> 	struct irq_glue *glue = NULL;
>>    	int  rc;
>>
>>    	napi->irq = irq;
>>
>>    #ifdef CONFIG_RFS_ACCEL
>>    	if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
>> 		rc = irq_cpu_rmap_add(napi->dev->rx_cpu_rmap, irq, napi,
>> 				      netif_irq_cpu_rmap_notify);
>> 		.
>> 		.
>> 		.
>>    	}
>>    #endif
>>
>> 	if (flags & NAPIF_IRQ_AFFINITY) {
>> 		glue = kzalloc(sizeof(*glue), GFP_KERNEL);
>> 		if (!glue)
>> 			return;
>> 		glue->notify.notify = netif_irq_cpu_rmap_notify;
>> 		glue->notify.release = netif_napi_affinity_release;
>> 		.
>> 		.
>> 	}
>>    }
>>
>>
>> Both branches assign the new cb function "netif_irq_cpu_rmap_notify()"
>> as the new IRQ notifier, but the first branch calls irq_cpu_rmap_add()
>> where the notifier is embedded in "struct irq_glue". So the cb function
>> needs to assume the notifier is inside irq_glue, so the second "if"
>> branch needs to do the same.
> 
> First off, I'm still a bit confused why you think the flags should be
> per NAPI call and not set at init time, once.
> Perhaps rename netif_enable_cpu_rmap() suggested earlier to something
> more generic (netif_enable_irq_tracking()?) and pass the flags there?
> Or is there a driver which wants to vary the flags per NAPI instance?
> 

set_irq() just seemed like the natural choice since it is already called
for each IRQ. I was also trying to avoid adding a new function. But sure,
I can do that and move the flags to netdev.

> Then you can probably register a single unified handler, and inside
> that handler check if the device wanted to have rmap or just affinity?

This is what this patch already does: all drivers following the new
approach will have netif_irq_cpu_rmap_notify() as their IRQ notifier.

IIUC, your goal is to have the notifier inside napi, not irq_glue. For
this, we'll need our own version of irq_cpu_rmap_add() (for the
reason above).

Sounds OK?
Jakub Kicinski Dec. 20, 2024, 8:51 p.m. UTC | #9
On Fri, 20 Dec 2024 13:14:48 -0700 Ahmed Zaki wrote:
> > Then you can probably register a single unified handler, and inside
> > that handler check if the device wanted to have rmap or just affinity?  
> 
> This is what is in this patch already, all drivers following new 
> approach will have netif_irq_cpu_rmap_notify() as their IRQ notifier.
> 
> IIUC, your goal is to have the notifier inside napi, not irq_glue. For 
> this, we'll have to have our own version of irq_cpu_rmap_add() (for the 
> above reason).
> 
> sounds OK?

Yes.
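
The direction agreed on above, with the notifier and index living in the NAPI structure itself as suggested in comment #3, can be sketched in userspace as follows. All edemo_* names are hypothetical, the macro is a simplified stand-in for container_of(), and the eventual kernel code may well look different. The point is only that the callback can reach the NAPI instance directly, with no separately allocated glue.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define edemo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct irq_affinity_notify. */
struct edemo_notify {
	void (*notify)(struct edemo_notify *n, unsigned long mask);
};

/* Hypothetical NAPI model with the notifier and index embedded. */
struct edemo_napi {
	int irq;
	struct edemo_notify notify;	/* embedded, nothing to kzalloc */
	unsigned short index;		/* rmap slot, models the u16 index */
	unsigned long saved_mask;
};

/* The callback recovers the NAPI instance straight from the notifier. */
void edemo_notify_cb(struct edemo_notify *n, unsigned long mask)
{
	struct edemo_napi *napi =
		edemo_container_of(n, struct edemo_napi, notify);

	assert(napi != NULL);
	napi->saved_mask = mask;
}
```

In this layout the notifier's lifetime is tied to the NAPI instance, so there is no allocation that can fail when the IRQ is bound.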
Patch

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 0df419052434..4fa047fad8fb 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -351,6 +351,7 @@  struct napi_config {
 	u64 gro_flush_timeout;
 	u64 irq_suspend_timeout;
 	u32 defer_hard_irqs;
+	cpumask_t affinity_mask;
 	unsigned int napi_id;
 };
 
@@ -358,12 +359,16 @@  enum {
 #ifdef CONFIG_RFS_ACCEL
 	NAPI_IRQ_ARFS_RMAP,		/* Core handles RMAP updates */
 #endif
+	NAPI_IRQ_AFFINITY,		/* Core manages IRQ affinity */
+	NAPI_IRQ_NORMAP			/* Set by core (internal) */
 };
 
 enum {
 #ifdef CONFIG_RFS_ACCEL
 	NAPIF_IRQ_ARFS_RMAP		= BIT(NAPI_IRQ_ARFS_RMAP),
 #endif
+	NAPIF_IRQ_AFFINITY		= BIT(NAPI_IRQ_AFFINITY),
+	NAPIF_IRQ_NORMAP		= BIT(NAPI_IRQ_NORMAP),
 };
 
 /*
diff --git a/net/core/dev.c b/net/core/dev.c
index 7c3abff48aea..84745cea03a7 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6705,8 +6705,44 @@  void netif_queue_set_napi(struct net_device *dev, unsigned int queue_index,
 }
 EXPORT_SYMBOL(netif_queue_set_napi);
 
+static void
+netif_irq_cpu_rmap_notify(struct irq_affinity_notify *notify,
+			  const cpumask_t *mask)
+{
+	struct irq_glue *glue =
+		container_of(notify, struct irq_glue, notify);
+	struct napi_struct *napi = glue->data;
+	unsigned int flags;
+	int rc;
+
+	flags = napi->irq_flags;
+
+	if (napi->config && flags & NAPIF_IRQ_AFFINITY)
+		cpumask_copy(&napi->config->affinity_mask, mask);
+
+#ifdef CONFIG_RFS_ACCEL
+	if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
+		rc = cpu_rmap_update(glue->rmap, glue->index, mask);
+		if (rc)
+			pr_warn("%s: update failed: %d\n",
+				__func__, rc);
+	}
+#endif
+}
+
+static void
+netif_napi_affinity_release(struct kref __always_unused *ref)
+{
+	struct irq_glue *glue =
+		container_of(ref, struct irq_glue, notify.kref);
+
+	kfree(glue);
+}
+
 void netif_napi_set_irq(struct napi_struct *napi, int irq, unsigned long flags)
 {
+	struct irq_glue *glue = NULL;
+	bool glue_created;
 	int  rc;
 
 	napi->irq = irq;
@@ -6714,15 +6750,29 @@  void netif_napi_set_irq(struct napi_struct *napi, int irq, unsigned long flags)
 
 #ifdef CONFIG_RFS_ACCEL
 	if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
-		rc = irq_cpu_rmap_add(napi->dev->rx_cpu_rmap, irq);
+		rc = irq_cpu_rmap_add(napi->dev->rx_cpu_rmap, irq, napi,
+				      netif_irq_cpu_rmap_notify);
 		if (rc) {
 			netdev_warn(napi->dev, "Unable to update ARFS map (%d).\n",
 				    rc);
 			free_irq_cpu_rmap(napi->dev->rx_cpu_rmap);
 			napi->dev->rx_cpu_rmap = NULL;
+		} else {
+			glue_created = true;
 		}
 	}
 #endif
+
+	if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
+		glue = kzalloc(sizeof(*glue), GFP_KERNEL);
+		if (!glue)
+			return;
+		glue->notify.notify = netif_irq_cpu_rmap_notify;
+		glue->notify.release = netif_napi_affinity_release;
+		glue->data = napi;
+		glue->rmap = NULL;
+		napi->irq_flags |= NAPIF_IRQ_NORMAP;
+	}
 }
 EXPORT_SYMBOL(netif_napi_set_irq);
 
@@ -6731,6 +6781,10 @@  static void napi_restore_config(struct napi_struct *n)
 	n->defer_hard_irqs = n->config->defer_hard_irqs;
 	n->gro_flush_timeout = n->config->gro_flush_timeout;
 	n->irq_suspend_timeout = n->config->irq_suspend_timeout;
+
+	if (n->irq > 0 && n->irq_flags & NAPIF_IRQ_AFFINITY)
+		irq_set_affinity(n->irq, &n->config->affinity_mask);
+
 	/* a NAPI ID might be stored in the config, if so use it. if not, use
 	 * napi_hash_add to generate one for us. It will be saved to the config
 	 * in napi_disable.
@@ -6747,6 +6801,11 @@  static void napi_save_config(struct napi_struct *n)
 	n->config->gro_flush_timeout = n->gro_flush_timeout;
 	n->config->irq_suspend_timeout = n->irq_suspend_timeout;
 	n->config->napi_id = n->napi_id;
+
+	if (n->irq > 0 &&
+	    n->irq_flags & (NAPIF_IRQ_AFFINITY | NAPIF_IRQ_NORMAP))
+		irq_set_affinity_notifier(n->irq, NULL);
+
 	napi_hash_del(n);
 }
 
@@ -11211,7 +11270,7 @@  struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 {
 	struct net_device *dev;
 	size_t napi_config_sz;
-	unsigned int maxqs;
+	unsigned int maxqs, i;
 
 	BUG_ON(strlen(name) >= sizeof(dev->name));
 
@@ -11307,6 +11366,9 @@  struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 	dev->napi_config = kvzalloc(napi_config_sz, GFP_KERNEL_ACCOUNT);
 	if (!dev->napi_config)
 		goto free_all;
+	for (i = 0; i < maxqs; i++)
+		cpumask_copy(&dev->napi_config[i].affinity_mask,
+			     cpu_online_mask);
 
 	strscpy(dev->name, name);
 	dev->name_assign_type = name_assign_type;