diff mbox series

[v3,4/9] cpumask: Introduce for_each_cpu_andnot()

Message ID 20220825181210.284283-5-vschneid@redhat.com (mailing list archive)
State Superseded
Series sched, net: NUMA-aware CPU spreading interface

Commit Message

Valentin Schneider Aug. 25, 2022, 6:12 p.m. UTC
for_each_cpu_and() is very convenient as it saves having to allocate a
temporary cpumask to store the result of cpumask_and(). The same issue
applies to cpumask_andnot(), which doesn't actually need temporary storage
for iteration purposes.

Following what has been done for for_each_cpu_and(), introduce
for_each_cpu_andnot().

Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
 include/linux/cpumask.h | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

Comments

Yury Norov Aug. 25, 2022, 9:14 p.m. UTC | #1
On Thu, Aug 25, 2022 at 07:12:05PM +0100, Valentin Schneider wrote:
> for_each_cpu_and() is very convenient as it saves having to allocate a
> temporary cpumask to store the result of cpumask_and(). The same issue
> applies to cpumask_andnot() which doesn't actually need temporary storage
> for iteration purposes.
> 
> Following what has been done for for_each_cpu_and(), introduce
> for_each_cpu_andnot().
> 
> Signed-off-by: Valentin Schneider <vschneid@redhat.com>
> ---
>  include/linux/cpumask.h | 39 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 39 insertions(+)
> 
> diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> index 1414ce8cd003..372a642bf9ba 100644
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -238,6 +238,25 @@ unsigned int cpumask_next_and(int n, const struct cpumask *src1p,
>  		nr_cpumask_bits, n + 1);
>  }
>  
> +/**
> + * cpumask_next_andnot - get the next cpu in *src1p & ~*src2p
> + * @n: the cpu prior to the place to search (ie. return will be > @n)
> + * @src1p: the first cpumask pointer
> + * @src2p: the second cpumask pointer
> + *
> + * Returns >= nr_cpu_ids if no further cpus set in *src1p & ~*src2p
> + */
> +static inline
> +unsigned int cpumask_next_andnot(int n, const struct cpumask *src1p,
> +				 const struct cpumask *src2p)
> +{
> +	/* -1 is a legal arg here. */
> +	if (n != -1)
> +		cpumask_check(n);
> +	return find_next_andnot_bit(cpumask_bits(src1p), cpumask_bits(src2p),
> +		nr_cpumask_bits, n + 1);
> +}
> +
>  /**
>   * for_each_cpu - iterate over every cpu in a mask
>   * @cpu: the (optionally unsigned) integer iterator
> @@ -317,6 +336,26 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
>  		(cpu) = cpumask_next_and((cpu), (mask1), (mask2)),	\
>  		(cpu) < nr_cpu_ids;)
>  
> +/**
> + * for_each_cpu_andnot - iterate over every cpu present in one mask, excluding
> + *			 those present in another.
> + * @cpu: the (optionally unsigned) integer iterator
> + * @mask1: the first cpumask pointer
> + * @mask2: the second cpumask pointer
> + *
> + * This saves a temporary CPU mask in many places.  It is equivalent to:
> + *	struct cpumask tmp;
> + *	cpumask_andnot(&tmp, &mask1, &mask2);
> + *	for_each_cpu(cpu, &tmp)
> + *		...
> + *
> + * After the loop, cpu is >= nr_cpu_ids.
> + */
> +#define for_each_cpu_andnot(cpu, mask1, mask2)				\
> +	for ((cpu) = -1;						\
> +		(cpu) = cpumask_next_andnot((cpu), (mask1), (mask2)),	\
> +		(cpu) < nr_cpu_ids;)

The standard doesn't guarantee the order of execution of the last 2 lines,
so you might end up with unreliable code. Can you do it in a more
conventional style:
   #define for_each_cpu_andnot(cpu, mask1, mask2)			\
   	for ((cpu) = cpumask_next_andnot(-1, (mask1), (mask2));	        \
   		(cpu) < nr_cpu_ids;                                     \
   		(cpu) = cpumask_next_andnot((cpu), (mask1), (mask2)))	

> +
>  /**
>   * cpumask_any_but - return a "random" in a cpumask, but not this one.
>   * @mask: the cpumask to search
> -- 
> 2.31.1
Valentin Schneider Sept. 5, 2022, 4:44 p.m. UTC | #2
On 25/08/22 14:14, Yury Norov wrote:
> On Thu, Aug 25, 2022 at 07:12:05PM +0100, Valentin Schneider wrote:
>> +#define for_each_cpu_andnot(cpu, mask1, mask2)				\
>> +	for ((cpu) = -1;						\
>> +		(cpu) = cpumask_next_andnot((cpu), (mask1), (mask2)),	\
>> +		(cpu) < nr_cpu_ids;)
>
> The standard doesn't guarantee the order of execution of last 2 lines,
> so you might end up with unreliable code. Can you do it in a more
> conventional style:
>    #define for_each_cpu_andnot(cpu, mask1, mask2)			\
>       for ((cpu) = cpumask_next_andnot(-1, (mask1), (mask2));         \
>               (cpu) < nr_cpu_ids;                                     \
>               (cpu) = cpumask_next_andnot((cpu), (mask1), (mask2)))
>

IIUC the order of execution *is* guaranteed as this is a comma operator,
not argument passing:

  6.5.17 Comma operator

  The left operand of a comma operator is evaluated as a void expression;
  there is a sequence point after its evaluation. Then the right operand is
  evaluated; the result has its type and value.

for_each_cpu{_and}() uses the same pattern (which I simply copied here).

Still, I'd be up for making this a bit more readable. I did a bit of
digging to figure out how we ended up with that pattern, and found

  7baac8b91f98 ("cpumask: make for_each_cpu_mask a bit smaller")

so this appears to have been done to save up on generated instructions.
*if* it is actually OK standard-wise, I'd vote to leave it as-is.

>> +
>>  /**
>>   * cpumask_any_but - return a "random" in a cpumask, but not this one.
>>   * @mask: the cpumask to search
>> --
>> 2.31.1
Yury Norov Sept. 5, 2022, 6:33 p.m. UTC | #3
On Mon, Sep 5, 2022 at 9:44 AM Valentin Schneider <vschneid@redhat.com> wrote:
>
> On 25/08/22 14:14, Yury Norov wrote:
> > On Thu, Aug 25, 2022 at 07:12:05PM +0100, Valentin Schneider wrote:
> >> +#define for_each_cpu_andnot(cpu, mask1, mask2)                              \
> >> +    for ((cpu) = -1;                                                \
> >> +            (cpu) = cpumask_next_andnot((cpu), (mask1), (mask2)),   \
> >> +            (cpu) < nr_cpu_ids;)
> >
> > The standard doesn't guarantee the order of execution of last 2 lines,
> > so you might end up with unreliable code. Can you do it in a more
> > conventional style:
> >    #define for_each_cpu_andnot(cpu, mask1, mask2)                     \
> >       for ((cpu) = cpumask_next_andnot(-1, (mask1), (mask2));         \
> >               (cpu) < nr_cpu_ids;                                     \
> >               (cpu) = cpumask_next_andnot((cpu), (mask1), (mask2)))
> >
>
> IIUC the order of execution *is* guaranteed as this is a comma operator,
> not argument passing:
>
>   6.5.17 Comma operator
>
>   The left operand of a comma operator is evaluated as a void expression;
>   there is a sequence point after its evaluation. Then the right operand is
>   evaluated; the result has its type and value.
>
> for_each_cpu{_and}() uses the same pattern (which I simply copied here).
>
> Still, I'd be up for making this a bit more readable. I did a bit of
> digging to figure out how we ended up with that pattern, and found
>
>   7baac8b91f98 ("cpumask: make for_each_cpu_mask a bit smaller")
>
> so this appears to have been done to save up on generated instructions.
> *if* it is actually OK standard-wise, I'd vote to leave it as-is.

Indeed. I was probably mixing this up with ANSI C.

Sorry for the noise.

Patch

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 1414ce8cd003..372a642bf9ba 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -238,6 +238,25 @@  unsigned int cpumask_next_and(int n, const struct cpumask *src1p,
 		nr_cpumask_bits, n + 1);
 }
 
+/**
+ * cpumask_next_andnot - get the next cpu in *src1p & ~*src2p
+ * @n: the cpu prior to the place to search (ie. return will be > @n)
+ * @src1p: the first cpumask pointer
+ * @src2p: the second cpumask pointer
+ *
+ * Returns >= nr_cpu_ids if no further cpus set in *src1p & ~*src2p
+ */
+static inline
+unsigned int cpumask_next_andnot(int n, const struct cpumask *src1p,
+				 const struct cpumask *src2p)
+{
+	/* -1 is a legal arg here. */
+	if (n != -1)
+		cpumask_check(n);
+	return find_next_andnot_bit(cpumask_bits(src1p), cpumask_bits(src2p),
+		nr_cpumask_bits, n + 1);
+}
+
 /**
  * for_each_cpu - iterate over every cpu in a mask
  * @cpu: the (optionally unsigned) integer iterator
@@ -317,6 +336,26 @@  unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
 		(cpu) = cpumask_next_and((cpu), (mask1), (mask2)),	\
 		(cpu) < nr_cpu_ids;)
 
+/**
+ * for_each_cpu_andnot - iterate over every cpu present in one mask, excluding
+ *			 those present in another.
+ * @cpu: the (optionally unsigned) integer iterator
+ * @mask1: the first cpumask pointer
+ * @mask2: the second cpumask pointer
+ *
+ * This saves a temporary CPU mask in many places.  It is equivalent to:
+ *	struct cpumask tmp;
+ *	cpumask_andnot(&tmp, &mask1, &mask2);
+ *	for_each_cpu(cpu, &tmp)
+ *		...
+ *
+ * After the loop, cpu is >= nr_cpu_ids.
+ */
+#define for_each_cpu_andnot(cpu, mask1, mask2)				\
+	for ((cpu) = -1;						\
+		(cpu) = cpumask_next_andnot((cpu), (mask1), (mask2)),	\
+		(cpu) < nr_cpu_ids;)
+
 /**
  * cpumask_any_but - return a "random" in a cpumask, but not this one.
  * @mask: the cpumask to search