RDMA/hfi: add a judgment on the availability of cpumask

Message ID	20230404030525.24020-1-xiaolinkui@126.com (mailing list archive)
State	Changes Requested
Headers	show Return-Path: <linux-rdma-owner@vger.kernel.org> From: xiaolinkui <xiaolinkui@126.com> To: dennis.dalessandro@cornelisnetworks.com, jgg@ziepe.ca, leon@kernel.org Cc: linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, Linkui Xiao <xiaolinkui@kylinos.cn> Subject: [PATCH] RDMA/hfi: add a judgment on the availability of cpumask Date: Tue, 4 Apr 2023 11:05:25 +0800 Message-Id: <20230404030525.24020-1-xiaolinkui@126.com> Precedence: bulk
Series	RDMA/hfi: add a judgment on the availability of cpumask \| expand RDMA/hfi: add a judgment on the availability of cpumask

xiaolinkui April 4, 2023, 3:05 a.m. UTC

From: Linkui Xiao <xiaolinkui@kylinos.cn>

When CONFIG_CPUMASK_OFFSTACK is n, cpumask may fail to allocate, cpumask may
be NULL, and performing a bitmap operation on cpumask may cause problems at
this time.

Of course, this is a unlikely event.

Signed-off-by: Linkui Xiao <xiaolinkui@kylinos.cn>
---
 drivers/infiniband/hw/hfi1/affinity.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Leon Romanovsky April 4, 2023, 6:05 a.m. UTC | #1

On Tue, Apr 04, 2023 at 11:05:25AM +0800, xiaolinkui wrote:
> From: Linkui Xiao <xiaolinkui@kylinos.cn>
> 
> When CONFIG_CPUMASK_OFFSTACK is n, cpumask may fail to allocate, cpumask may
> be NULL, and performing a bitmap operation on cpumask may cause problems at
> this time.
> 
> Of course, this is a unlikely event.
> 
> Signed-off-by: Linkui Xiao <xiaolinkui@kylinos.cn>
> ---
>  drivers/infiniband/hw/hfi1/affinity.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/hfi1/affinity.c b/drivers/infiniband/hw/hfi1/affinity.c
> index 77ee77d4000f..3caa861f4d1d 100644
> --- a/drivers/infiniband/hw/hfi1/affinity.c
> +++ b/drivers/infiniband/hw/hfi1/affinity.c
> @@ -1047,16 +1047,16 @@ int hfi1_get_proc_affinity(int node)
>  	 */
>  
>  	ret = zalloc_cpumask_var(&diff, GFP_KERNEL);
> -	if (!ret)
> +	if (!ret || unlikely(!diff))

Why do you think that check of "ret" is not enough?
"ret" will be false if diff == NULL.

Thanks

xiaolinkui April 4, 2023, 7:01 a.m. UTC | #2

Thanks for your reply.

When CONFIG_CPUMASK_OFFSTACK=y, "ret" will be false if diff==NULL.

However, when CONFIG_CPUMASK_OFFSTACK=n, these two are not necessarily 
equivalent.

Thanks

On 4/4/23 14:05, Leon Romanovsky wrote:
> On Tue, Apr 04, 2023 at 11:05:25AM +0800, xiaolinkui wrote:
>> From: Linkui Xiao <xiaolinkui@kylinos.cn>
>>
>> When CONFIG_CPUMASK_OFFSTACK is n, cpumask may fail to allocate, cpumask may
>> be NULL, and performing a bitmap operation on cpumask may cause problems at
>> this time.
>>
>> Of course, this is a unlikely event.
>>
>> Signed-off-by: Linkui Xiao <xiaolinkui@kylinos.cn>
>> ---
>>   drivers/infiniband/hw/hfi1/affinity.c | 8 ++++----
>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/infiniband/hw/hfi1/affinity.c b/drivers/infiniband/hw/hfi1/affinity.c
>> index 77ee77d4000f..3caa861f4d1d 100644
>> --- a/drivers/infiniband/hw/hfi1/affinity.c
>> +++ b/drivers/infiniband/hw/hfi1/affinity.c
>> @@ -1047,16 +1047,16 @@ int hfi1_get_proc_affinity(int node)
>>   	 */
>>   
>>   	ret = zalloc_cpumask_var(&diff, GFP_KERNEL);
>> -	if (!ret)
>> +	if (!ret || unlikely(!diff))
> Why do you think that check of "ret" is not enough?
> "ret" will be false if diff == NULL.
>
> Thanks

Leon Romanovsky April 4, 2023, 8:02 a.m. UTC | #3

On Tue, Apr 04, 2023 at 03:01:58PM +0800, xiaolinkui wrote:
> Thanks for your reply.
> 
> When CONFIG_CPUMASK_OFFSTACK=y, "ret" will be false if diff==NULL.
> 
> However, when CONFIG_CPUMASK_OFFSTACK=n, these two are not necessarily
> equivalent.

And what is the problem with that? cpumask_* API is built to handle
correctly this mode.

Reminder: please don't top post.

> 
> Thanks

kernel test robot April 7, 2023, 4:07 a.m. UTC | #4

Hi xiaolinkui,

kernel test robot noticed the following build warnings:

[auto build test WARNING on rdma/for-next]
[also build test WARNING on linus/master v6.3-rc5 next-20230406]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/xiaolinkui/RDMA-hfi-add-a-judgment-on-the-availability-of-cpumask/20230404-113847
base:   https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-next
patch link:    https://lore.kernel.org/r/20230404030525.24020-1-xiaolinkui%40126.com
patch subject: [PATCH] RDMA/hfi: add a judgment on the availability of cpumask
config: x86_64-randconfig-a003-20220117 (https://download.01.org/0day-ci/archive/20230407/202304071125.xK1fezQ1-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/206c6fc9aa5afd354f4201216ca8c2c0057fb49d
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review xiaolinkui/RDMA-hfi-add-a-judgment-on-the-availability-of-cpumask/20230404-113847
        git checkout 206c6fc9aa5afd354f4201216ca8c2c0057fb49d
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=x86_64 olddefconfig
        make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/infiniband/hw/hfi1/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304071125.xK1fezQ1-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from include/asm-generic/bug.h:5,
                    from arch/x86/include/asm/bug.h:87,
                    from include/linux/bug.h:5,
                    from include/linux/mmdebug.h:5,
                    from include/linux/percpu.h:5,
                    from include/linux/arch_topology.h:9,
                    from include/linux/topology.h:30,
                    from drivers/infiniband/hw/hfi1/affinity.c:6:
   drivers/infiniband/hw/hfi1/affinity.c: In function 'hfi1_get_proc_affinity':
>> drivers/infiniband/hw/hfi1/affinity.c:1050:30: warning: the address of 'diff' will always evaluate as 'true' [-Waddress]
    1050 |         if (!ret || unlikely(!diff))
         |                              ^
   include/linux/compiler.h:78:45: note: in definition of macro 'unlikely'
      78 | # define unlikely(x)    __builtin_expect(!!(x), 0)
         |                                             ^
>> drivers/infiniband/hw/hfi1/affinity.c:1053:30: warning: the address of 'hw_thread_mask' will always evaluate as 'true' [-Waddress]
    1053 |         if (!ret || unlikely(!hw_thread_mask))
         |                              ^
   include/linux/compiler.h:78:45: note: in definition of macro 'unlikely'
      78 | # define unlikely(x)    __builtin_expect(!!(x), 0)
         |                                             ^
>> drivers/infiniband/hw/hfi1/affinity.c:1056:30: warning: the address of 'available_mask' will always evaluate as 'true' [-Waddress]
    1056 |         if (!ret || unlikely(!available_mask))
         |                              ^
   include/linux/compiler.h:78:45: note: in definition of macro 'unlikely'
      78 | # define unlikely(x)    __builtin_expect(!!(x), 0)
         |                                             ^
>> drivers/infiniband/hw/hfi1/affinity.c:1059:30: warning: the address of 'intrs_mask' will always evaluate as 'true' [-Waddress]
    1059 |         if (!ret || unlikely(!intrs_mask))
         |                              ^
   include/linux/compiler.h:78:45: note: in definition of macro 'unlikely'
      78 | # define unlikely(x)    __builtin_expect(!!(x), 0)
         |                                             ^


vim +1050 drivers/infiniband/hw/hfi1/affinity.c

   995	
   996	int hfi1_get_proc_affinity(int node)
   997	{
   998		int cpu = -1, ret, i;
   999		struct hfi1_affinity_node *entry;
  1000		cpumask_var_t diff, hw_thread_mask, available_mask, intrs_mask;
  1001		const struct cpumask *node_mask,
  1002			*proc_mask = current->cpus_ptr;
  1003		struct hfi1_affinity_node_list *affinity = &node_affinity;
  1004		struct cpu_mask_set *set = &affinity->proc;
  1005	
  1006		/*
  1007		 * check whether process/context affinity has already
  1008		 * been set
  1009		 */
  1010		if (current->nr_cpus_allowed == 1) {
  1011			hfi1_cdbg(PROC, "PID %u %s affinity set to CPU %*pbl",
  1012				  current->pid, current->comm,
  1013				  cpumask_pr_args(proc_mask));
  1014			/*
  1015			 * Mark the pre-set CPU as used. This is atomic so we don't
  1016			 * need the lock
  1017			 */
  1018			cpu = cpumask_first(proc_mask);
  1019			cpumask_set_cpu(cpu, &set->used);
  1020			goto done;
  1021		} else if (current->nr_cpus_allowed < cpumask_weight(&set->mask)) {
  1022			hfi1_cdbg(PROC, "PID %u %s affinity set to CPU set(s) %*pbl",
  1023				  current->pid, current->comm,
  1024				  cpumask_pr_args(proc_mask));
  1025			goto done;
  1026		}
  1027	
  1028		/*
  1029		 * The process does not have a preset CPU affinity so find one to
  1030		 * recommend using the following algorithm:
  1031		 *
  1032		 * For each user process that is opening a context on HFI Y:
  1033		 *  a) If all cores are filled, reinitialize the bitmask
  1034		 *  b) Fill real cores first, then HT cores (First set of HT
  1035		 *     cores on all physical cores, then second set of HT core,
  1036		 *     and, so on) in the following order:
  1037		 *
  1038		 *     1. Same NUMA node as HFI Y and not running an IRQ
  1039		 *        handler
  1040		 *     2. Same NUMA node as HFI Y and running an IRQ handler
  1041		 *     3. Different NUMA node to HFI Y and not running an IRQ
  1042		 *        handler
  1043		 *     4. Different NUMA node to HFI Y and running an IRQ
  1044		 *        handler
  1045		 *  c) Mark core as filled in the bitmask. As user processes are
  1046		 *     done, clear cores from the bitmask.
  1047		 */
  1048	
  1049		ret = zalloc_cpumask_var(&diff, GFP_KERNEL);
> 1050		if (!ret || unlikely(!diff))
  1051			goto done;
  1052		ret = zalloc_cpumask_var(&hw_thread_mask, GFP_KERNEL);
> 1053		if (!ret || unlikely(!hw_thread_mask))
  1054			goto free_diff;
  1055		ret = zalloc_cpumask_var(&available_mask, GFP_KERNEL);
> 1056		if (!ret || unlikely(!available_mask))
  1057			goto free_hw_thread_mask;
  1058		ret = zalloc_cpumask_var(&intrs_mask, GFP_KERNEL);
> 1059		if (!ret || unlikely(!intrs_mask))
  1060			goto free_available_mask;
  1061	
  1062		mutex_lock(&affinity->lock);
  1063		/*
  1064		 * If we've used all available HW threads, clear the mask and start
  1065		 * overloading.
  1066		 */
  1067		_cpu_mask_set_gen_inc(set);
  1068	
  1069		/*
  1070		 * If NUMA node has CPUs used by interrupt handlers, include them in the
  1071		 * interrupt handler mask.
  1072		 */
  1073		entry = node_affinity_lookup(node);
  1074		if (entry) {
  1075			cpumask_copy(intrs_mask, (entry->def_intr.gen ?
  1076						  &entry->def_intr.mask :
  1077						  &entry->def_intr.used));
  1078			cpumask_or(intrs_mask, intrs_mask, (entry->rcv_intr.gen ?
  1079							    &entry->rcv_intr.mask :
  1080							    &entry->rcv_intr.used));
  1081			cpumask_or(intrs_mask, intrs_mask, &entry->general_intr_mask);
  1082		}
  1083		hfi1_cdbg(PROC, "CPUs used by interrupts: %*pbl",
  1084			  cpumask_pr_args(intrs_mask));
  1085	
  1086		cpumask_copy(hw_thread_mask, &set->mask);
  1087	
  1088		/*
  1089		 * If HT cores are enabled, identify which HW threads within the
  1090		 * physical cores should be used.
  1091		 */
  1092		if (affinity->num_core_siblings > 0) {
  1093			for (i = 0; i < affinity->num_core_siblings; i++) {
  1094				find_hw_thread_mask(i, hw_thread_mask, affinity);
  1095	
  1096				/*
  1097				 * If there's at least one available core for this HW
  1098				 * thread number, stop looking for a core.
  1099				 *
  1100				 * diff will always be not empty at least once in this
  1101				 * loop as the used mask gets reset when
  1102				 * (set->mask == set->used) before this loop.
  1103				 */
  1104				cpumask_andnot(diff, hw_thread_mask, &set->used);
  1105				if (!cpumask_empty(diff))
  1106					break;
  1107			}
  1108		}
  1109		hfi1_cdbg(PROC, "Same available HW thread on all physical CPUs: %*pbl",
  1110			  cpumask_pr_args(hw_thread_mask));
  1111	
  1112		node_mask = cpumask_of_node(node);
  1113		hfi1_cdbg(PROC, "Device on NUMA %u, CPUs %*pbl", node,
  1114			  cpumask_pr_args(node_mask));
  1115	
  1116		/* Get cpumask of available CPUs on preferred NUMA */
  1117		cpumask_and(available_mask, hw_thread_mask, node_mask);
  1118		cpumask_andnot(available_mask, available_mask, &set->used);
  1119		hfi1_cdbg(PROC, "Available CPUs on NUMA %u: %*pbl", node,
  1120			  cpumask_pr_args(available_mask));
  1121	
  1122		/*
  1123		 * At first, we don't want to place processes on the same
  1124		 * CPUs as interrupt handlers. Then, CPUs running interrupt
  1125		 * handlers are used.
  1126		 *
  1127		 * 1) If diff is not empty, then there are CPUs not running
  1128		 *    non-interrupt handlers available, so diff gets copied
  1129		 *    over to available_mask.
  1130		 * 2) If diff is empty, then all CPUs not running interrupt
  1131		 *    handlers are taken, so available_mask contains all
  1132		 *    available CPUs running interrupt handlers.
  1133		 * 3) If available_mask is empty, then all CPUs on the
  1134		 *    preferred NUMA node are taken, so other NUMA nodes are
  1135		 *    used for process assignments using the same method as
  1136		 *    the preferred NUMA node.
  1137		 */
  1138		cpumask_andnot(diff, available_mask, intrs_mask);
  1139		if (!cpumask_empty(diff))
  1140			cpumask_copy(available_mask, diff);
  1141	
  1142		/* If we don't have CPUs on the preferred node, use other NUMA nodes */
  1143		if (cpumask_empty(available_mask)) {
  1144			cpumask_andnot(available_mask, hw_thread_mask, &set->used);
  1145			/* Excluding preferred NUMA cores */
  1146			cpumask_andnot(available_mask, available_mask, node_mask);
  1147			hfi1_cdbg(PROC,
  1148				  "Preferred NUMA node cores are taken, cores available in other NUMA nodes: %*pbl",
  1149				  cpumask_pr_args(available_mask));
  1150	
  1151			/*
  1152			 * At first, we don't want to place processes on the same
  1153			 * CPUs as interrupt handlers.
  1154			 */
  1155			cpumask_andnot(diff, available_mask, intrs_mask);
  1156			if (!cpumask_empty(diff))
  1157				cpumask_copy(available_mask, diff);
  1158		}
  1159		hfi1_cdbg(PROC, "Possible CPUs for process: %*pbl",
  1160			  cpumask_pr_args(available_mask));
  1161	
  1162		cpu = cpumask_first(available_mask);
  1163		if (cpu >= nr_cpu_ids) /* empty */
  1164			cpu = -1;
  1165		else
  1166			cpumask_set_cpu(cpu, &set->used);
  1167	
  1168		mutex_unlock(&affinity->lock);
  1169		hfi1_cdbg(PROC, "Process assigned to CPU %d", cpu);
  1170	
  1171		free_cpumask_var(intrs_mask);
  1172	free_available_mask:
  1173		free_cpumask_var(available_mask);
  1174	free_hw_thread_mask:
  1175		free_cpumask_var(hw_thread_mask);
  1176	free_diff:
  1177		free_cpumask_var(diff);
  1178	done:
  1179		return cpu;
  1180	}
  1181

RDMA/hfi: add a judgment on the availability of cpumask

Commit Message

Comments

Patch