[v5,0/7] bits: Fixed-type GENMASK()/BIT()

Message ID	20250306-fixed-type-genmasks-v5-0-b443e9dcba63@wanadoo.fr (mailing list archive)
Headers	show Return-Path: <intel-gfx-bounces@lists.freedesktop.org> From: Vincent Mailhol via B4 Relay <devnull+mailhol.vincent.wanadoo.fr@kernel.org> Subject: [PATCH v5 0/7] bits: Fixed-type GENMASK()/BIT() Date: Thu, 06 Mar 2025 20:29:51 +0900 Message-Id: <20250306-fixed-type-genmasks-v5-0-b443e9dcba63@wanadoo.fr> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 8bit To: Yury Norov <yury.norov@gmail.com>, Lucas De Marchi <lucas.demarchi@intel.com>, Rasmus Villemoes <linux@rasmusvillemoes.dk>, Jani Nikula <jani.nikula@linux.intel.com>, Joonas Lahtinen <joonas.lahtinen@linux.intel.com>, Rodrigo Vivi <rodrigo.vivi@intel.com>, Tvrtko Ursulin <tursulin@ursulin.net>, David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>, Andrew Morton <akpm@linux-foundation.org> Cc: linux-kernel@vger.kernel.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, Andi Shyti <andi.shyti@linux.intel.com>, David Laight <David.Laight@ACULAB.COM>, Dmitry Baryshkov <dmitry.baryshkov@linaro.org>, Andy Shevchenko <andriy.shevchenko@linux.intel.com>, Vincent Mailhol <mailhol.vincent@wanadoo.fr>, Jani Nikula <jani.nikula@intel.com> Precedence: list Reply-To: mailhol.vincent@wanadoo.fr Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>
Series	bits: Fixed-type GENMASK()/BIT() \| expand [v5,0/7] bits: Fixed-type GENMASK()/BIT() [v5,1/7] bits: split the definition of the asm and non-asm GENMASK() [v5,2/7] bits: introduce fixed-type genmasks [v5,3/7] bits: introduce fixed-type BIT_U() [v5,4/7] drm/i915: Convert REG_GENMASK() to fixed-width GENMASK_U() [v5,5/7] test_bits: add tests for __GENMASK() and __GENMASK_ULL() [v5,6/7] test_bits: add tests for GENMASK_U() [v5,7/7] test_bits: add tests for BIT_U*()

Vincent Mailhol via B4 Relay March 6, 2025, 11:29 a.m. UTC

Introduce some fixed width variant of the GENMASK() and the BIT()
macros in bits.h. Note that the main goal is not to get the correct
type, but rather to enforce more checks at compile time. For example:

  GENMASK_U16(16, 0)

will raise a build bug.

This series is a continuation of:

  https://lore.kernel.org/intel-xe/20240208074521.577076-1-lucas.demarchi@intel.com

from Lucas De Marchi. Above series is one year old. I really think
that this was a good idea and I do not want this series to die. So I
am volunteering to revive it.

Meanwhile, many changes occurred in bits.h. The most significant
change is that __GENMASK() was moved to the uapi headers.

In v4 an onward, I introduce one big change: split the definition of
the asm and non-asm GENMASK(). I think this is controversial.
Especially, Yury commented that he did not want such split. So I
initially implemented a first draft in which both the asm and non-asm
version would rely on the same helper macro, i.e. adding this:

  #define __GENMASK_t(t, w, h, l)			\
  	(((t)~_ULL(0) - ((t)1 << (l)) + 1) &		\
  	 ((t)~_ULL(0) >> (w - 1 - (h))))
    
to uapi/bits.h. And then, the different GENMASK()s would look like
this:

  #define __GENMASK(h, l) __GENMASK_t(unsigned long, __BITS_PER_LONG, h, l)
    
and so on.
    
I implemented it, and the final result looks quite ugly. Not only do
we need to manually provide the width each time, the biggest concern
is that adding this to the uapi is asking for trouble. Who knows how
people are going to use this? And once it is in the uapi, there is
virtually no way back.

Finally, I do not think it makes sense to expose the fixed width
variants to the asm. The fixed width integers type are a C
concept. For asm, the long and long long variants seems sufficient.

And so, after implementing both, the asm and non-asm split seems way
more clean and I think this is the best compromise. Let me know what
you think :)

As requested, here are the bloat-o-meter stats:

  $ ./scripts/bloat-o-meter vmlinux_before.o vmlinux_after.o 
  add/remove: 0/0 grow/shrink: 4/2 up/down: 5/-4 (1)
  Function                                     old     new   delta
  intel_psr_invalidate                         666     668      +2
  mst_stream_compute_config                   1652    1653      +1
  intel_psr_flush                              977     978      +1
  intel_dp_compute_link_config                1327    1328      +1
  cfg80211_inform_bss_data                    5109    5108      -1
  intel_drrs_activate                          379     376      -3
  Total: Before=22723481, After=22723482, chg +0.00%

(done with GCC 12.4.1 on a defconfig)

--
2.43.0

---
Changes from v4:

  - Rebase on https://github.com/norov/linux/tree/bitmap-for-next

  - Rename GENMASK_t() to GENMASK_TYPE()

  - First patch of v4 (the typo fix 'init128' -> 'int128') is removed
    because it was resent separately in:
    https://lore.kernel.org/all/20250305-fix_init128_typo-v1-1-cbe5b8e54e7d@wanadoo.fr

  - Replace the (t)~ULL(0) by type_max(t). This way, GENMASK_TYPE()
    can now be used to generate GENMASK_U128().

  - Get rid of the unsigned int cast for the U8 and U16 variants.

  - Add the BIT_TYPE() helper macro.

  - Link to v4: https://lore.kernel.org/r/20250305-fixed-type-genmasks-v4-0-1873dcdf6723@wanadoo.fr

Changes from v3:

  - Rebase on v6.14-rc5

  - Fix a typo in GENMASK_U128() comment.

  - Split the asm and non-asm definition of 

  - Replace ~0ULL by ~ULL(0)

  - Since v3, __GENMASK() was moved to the uapi and people started
    using directly. Introduce GENMASK_t() instead.

  - Link to v3: https://lore.kernel.org/intel-xe/20240208074521.577076-1-lucas.demarchi@intel.com

Changes from v2:

  - Document both in commit message and code about the strict type
    checking and give examples how it´d break with invalid params.

  - Link to v2: https://lore.kernel.org/intel-xe/20240124050205.3646390-1-lucas.demarchi@intel.com

Link to v1: https://lore.kernel.org/intel-xe/20230509051403.2748545-1-lucas.demarchi@intel.com

---
Lucas De Marchi (3):
      bits: introduce fixed-type BIT_U*()
      drm/i915: Convert REG_GENMASK*() to fixed-width GENMASK_U*()
      test_bits: add tests for GENMASK_U*()

Vincent Mailhol (3):
      bits: split the definition of the asm and non-asm GENMASK()
      test_bits: add tests for __GENMASK() and __GENMASK_ULL()
      test_bits: add tests for BIT_U*()

Yury Norov (1):
      bits: introduce fixed-type genmasks

 drivers/gpu/drm/i915/i915_reg_defs.h | 108 ++++-------------------------------
 include/linux/bitops.h               |   1 -
 include/linux/bits.h                 |  77 ++++++++++++++++++-------
 lib/test_bits.c                      |  47 +++++++++++++++
 4 files changed, 113 insertions(+), 120 deletions(-)
---
base-commit: 0312e94abe484b9ee58c32d2f8ba177e04955b35
change-id: 20250228-fixed-type-genmasks-8d1a555f34e8

Best regards,

Vincent Mailhol March 6, 2025, 2:56 p.m. UTC | #1

On 06/03/2025 at 22:02, Andy Shevchenko wrote:
> On Thu, Mar 06, 2025 at 08:29:51PM +0900, Vincent Mailhol via B4 Relay wrote:
>> Introduce some fixed width variant of the GENMASK() and the BIT()
>> macros in bits.h. Note that the main goal is not to get the correct
>> type, but rather to enforce more checks at compile time. For example:
>>
>>   GENMASK_U16(16, 0)
>>
>> will raise a build bug.
>>
>> This series is a continuation of:
>>
>>   https://lore.kernel.org/intel-xe/20240208074521.577076-1-lucas.demarchi@intel.com
>>
>> from Lucas De Marchi. Above series is one year old. I really think
>> that this was a good idea and I do not want this series to die. So I
>> am volunteering to revive it.
>>
>> Meanwhile, many changes occurred in bits.h. The most significant
>> change is that __GENMASK() was moved to the uapi headers.
>>
>> In v4 an onward, I introduce one big change: split the definition of
>> the asm and non-asm GENMASK(). I think this is controversial.
>> Especially, Yury commented that he did not want such split. So I
>> initially implemented a first draft in which both the asm and non-asm
>> version would rely on the same helper macro, i.e. adding this:
>>
>>   #define __GENMASK_t(t, w, h, l)			\
> 
> I thought we agreed on renaming...
> 
>>   	(((t)~_ULL(0) - ((t)1 << (l)) + 1) &		\
>>   	 ((t)~_ULL(0) >> (w - 1 - (h))))
>>     
>> to uapi/bits.h. And then, the different GENMASK()s would look like
>> this:
>>
>>   #define __GENMASK(h, l) __GENMASK_t(unsigned long, __BITS_PER_LONG, h, l)
> 
> Ditto.

I forgot to update the cover letter, my bad…

>> and so on.
>>     
>> I implemented it, and the final result looks quite ugly. Not only do
>> we need to manually provide the width each time, the biggest concern
>> is that adding this to the uapi is asking for trouble. Who knows how
>> people are going to use this? And once it is in the uapi, there is
>> virtually no way back.
>>
>> Finally, I do not think it makes sense to expose the fixed width
>> variants to the asm. The fixed width integers type are a C
>> concept. For asm, the long and long long variants seems sufficient.
>>
>> And so, after implementing both, the asm and non-asm split seems way
>> more clean and I think this is the best compromise. Let me know what
>> you think :)
>>
>> As requested, here are the bloat-o-meter stats:
>>
>>   $ ./scripts/bloat-o-meter vmlinux_before.o vmlinux_after.o 
>>   add/remove: 0/0 grow/shrink: 4/2 up/down: 5/-4 (1)
>>   Function                                     old     new   delta
>>   intel_psr_invalidate                         666     668      +2
>>   mst_stream_compute_config                   1652    1653      +1
>>   intel_psr_flush                              977     978      +1
>>   intel_dp_compute_link_config                1327    1328      +1
>>   cfg80211_inform_bss_data                    5109    5108      -1
>>   intel_drrs_activate                          379     376      -3
>>   Total: Before=22723481, After=22723482, chg +0.00%
>>
>> (done with GCC 12.4.1 on a defconfig)
> 
> What defconfig? x86_64_defconfig?

Yes, x86_64 defconfig.


Yours sincerely,
Vincent Mailhol

Vincent Mailhol March 6, 2025, 4:08 p.m. UTC | #2

On 06/03/2025 à 22:08, Andy Shevchenko wrote:
> On Thu, Mar 06, 2025 at 08:29:53PM +0900, Vincent Mailhol via B4 Relay wrote:
>> From: Yury Norov <yury.norov@gmail.com>
>>
>> Add GENMASK_TYPE() which generalizes __GENMASK() to support different
>> types, and implement fixed-types versions of GENMASK() based on it.
>> The fixed-type version allows more strict checks to the min/max values
>> accepted, which is useful for defining registers like implemented by
>> i915 and xe drivers with their REG_GENMASK*() macros.
>>
>> The strict checks rely on shift-count-overflow compiler check to fail
>> the build if a number outside of the range allowed is passed.
>> Example:
>>
>>   #define FOO_MASK GENMASK_U32(33, 4)
>>
>> will generate a warning like:
>>
>>   include/linux/bits.h:51:27: error: right shift count >= width of type [-Werror=shift-count-overflow]
>>      51 |               type_max(t) >> (BITS_PER_TYPE(t) - 1 - (h)))))
>>         |                           ^~
> 
> Code LGTM

Does this mean I get your Reviewed-by tag? Or will you wait the v6 to
formally give it?

> but just to be sure: you prepared your series using histogram
> diff algo, right?

No, I never used the histogram diff. My git config is extremely boring.
Mostly vanilla.

I remember that Linus even commented on this:

https://lore.kernel.org/all/CAHk-=wiUxm-NZ1si8dXWVTTJ9n3c+1SRTC0V+Lk7hOE4bDVwJQ@mail.gmail.com/

But he made it clear this was *not* a requirement, so I just left the
diff algorithm to the default. Or did I miss any communication that
contributors should now use histogram diff?

Regardless, I do not mind activating it. I just did a:

  git config diff.algorithm histogram

The v6 will have histogram diffs.


Yours sincerely,
Vincent Mailhol

Vincent Mailhol March 6, 2025, 4:08 p.m. UTC | #3

On 06/03/2025 at 22:11, Andy Shevchenko wrote:
> On Thu, Mar 06, 2025 at 08:29:58PM +0900, Vincent Mailhol via B4 Relay wrote:
>> From: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
>>
>> Add some additional tests in lib/test_bits.c to cover the expected
>> results of the fixed type BIT_U*() macros.
> 
> Still would be good to have a small assembly test case for GENMASK*() as they
> went split and it will be a good regression test in case somebody decides to
> unify both without much thinking..

Let me confirm that I correctly understood your ask. Would something
like this meet your expectations?

diff --git a/lib/test_bits.c b/lib/test_bits.c
index 72984fae7b81..869b291587e6 100644
--- a/lib/test_bits.c
+++ b/lib/test_bits.c
@@ -136,6 +136,29 @@ static void genmask_input_check_test(struct kunit
*test)
 	KUNIT_EXPECT_EQ(test, 0, GENMASK_INPUT_CHECK(127, 0));
 }

+#undef __LINUX_BITS_H
+#undef GENMASK
+#undef GENMASK_ULL
+#define __ASSEMBLY__
+#include <linux/bits.h>
+static void asm_genmask_test(struct kunit *test)
+{
+	KUNIT_EXPECT_EQ(test, 1ul, GENMASK(0, 0));
+	KUNIT_EXPECT_EQ(test, 3ul, GENMASK(1, 0));
+	KUNIT_EXPECT_EQ(test, 6ul, GENMASK(2, 1));
+	KUNIT_EXPECT_EQ(test, 0xFFFFFFFFul, GENMASK(31, 0));
+}
+
+static void asm_genmask_ull_test(struct kunit *test)
+{
+	KUNIT_EXPECT_EQ(test, 1ull, GENMASK_ULL(0, 0));
+	KUNIT_EXPECT_EQ(test, 3ull, GENMASK_ULL(1, 0));
+	KUNIT_EXPECT_EQ(test, 0x000000ffffe00000ull, GENMASK_ULL(39, 21));
+	KUNIT_EXPECT_EQ(test, 0xffffffffffffffffull, GENMASK_ULL(63, 0));
+}
+#undef __ASSEMBLY__
+#undef GENMASK
+#undef GENMASK_ULL

 static struct kunit_case bits_test_cases[] = {
        KUNIT_CASE(__genmask_test),
@@ -144,6 +167,8 @@ static struct kunit_case bits_test_cases[] = {
 	KUNIT_CASE(genmask_ull_test),
 	KUNIT_CASE(genmask_u128_test),
 	KUNIT_CASE(genmask_input_check_test),
+	KUNIT_CASE(asm_genmask_test),
+	KUNIT_CASE(asm_genmask_ull_test),
 	{}
 };


Yours sincerely,
Vincent Mailhol

Vincent Mailhol March 6, 2025, 5:10 p.m. UTC | #4

On 06/03/2025 at 23:34, Lucas De Marchi wrote:
> On Thu, Mar 06, 2025 at 08:29:52PM +0900, Vincent Mailhol via B4 Relay
> wrote:

(...)

> it seems we now have 1 inconsistency that we comment why
> GENMASK_U128() is not available in asm, but we don't comment why
> GENMASK_INPUT_CHECK() is not available there. Maybe move this comment on
> top of GENMASK_INPUT_CHECK().

I will restore the comment in v6 and put it next to the asm definition,
c.f. my reply to Andy.

> Anyway,
> 
>     Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

Is this only valid for the first patch or for the full series? If this
is for the full series, would you mind replying to the cover letter with
your review tag?

> thanks for picking up this series.

You are welcome!

> Lucas De Marchi

(...)

Yours sincerely,
Vincent Mailhol

Vincent Mailhol March 7, 2025, 9:58 a.m. UTC | #5

On 07/03/2025 at 04:23, David Laight wrote:
> On Thu, 06 Mar 2025 20:29:52 +0900
> Vincent Mailhol via B4 Relay <devnull+mailhol.vincent.wanadoo.fr@kernel.org> wrote:
> 
>> From: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
>>
>> In an upcoming change, GENMASK() and its friends will indirectly
>> depend on sizeof() which is not available in asm.
>>
>> Instead of adding further complexity to __GENMASK() to make it work
>> for both asm and non asm, just split the definition of the two
>> variants.
> ...
>> +#else /* defined(__ASSEMBLY__) */
>> +
>> +#define GENMASK(h, l)		__GENMASK(h, l)
>> +#define GENMASK_ULL(h, l)	__GENMASK_ULL(h, l)
> 
> What do those actually expand to now?
> As I've said a few times both UL(0) and ULL(0) are just (0) for __ASSEMBLY__
> so the expansions of __GENMASK() and __GENMASK_ULL() contained the
> same numeric constants.

Indeed, in asm, the UL(0) and ULL(0) expands to the same thing: 0.

But the two macros still expand to something different on 32 bits
architectures:

  * __GENMASK:

      (((~(0)) << (l)) & (~(0) >> (32 - 1 - (h))))

  * __GENMASK_ULL:

      (((~(0)) << (l)) & (~(0) >> (64 - 1 - (h))))

On 64 bits architecture these are the same.

> This means they should be generating the same values.
> I don't know the correct 'sizeof (int_type)' for the shift right of ~0.
> My suspicion is that a 32bit assembler used 32bit signed integers and a
> 64bit one 64bit signed integers (but a 32bit asm on a 64bit host might
> be 64bit).
> So the asm versions need to avoid the right shift and only do left shifts.
> 
> Which probably means they need to be enirely separate from the C versions.
> And then the C ones can have all the ULL() removed.

In this v5, I already have the ULL() removed from the non-uapi C
version. And we are left with two distinct variants:

  - the uapi C & asm
  - the non-uapi C (including fix width)

For the uapi ones, I do not think we can modify it without a risk of
breaking some random userland. At least, this is not a risk I will take.
And if we have to keep the __GENMASK() and __GENMASK_ULL(), then I would
rather just reuse these for the asm variant instead of splitting further
more and finding ourselves with three variants:

  - the uapi C
  - the asm
  - the non-uapi C (including fix width)

If __GENMASK() and __GENMASK_ULL() were not in the uapi, I would have
agreed with you.

If you believe that the risk of modifying the uapi GENMASK*() is low
enough, then you can submit a patch. But I will definitely not touch
these myself.

Yours sincerely,
Vincent Mailhol

David Laight March 7, 2025, 1:27 p.m. UTC | #6

On Fri, 7 Mar 2025 18:58:08 +0900
Vincent Mailhol <mailhol.vincent@wanadoo.fr> wrote:

> On 07/03/2025 at 04:23, David Laight wrote:
> > On Thu, 06 Mar 2025 20:29:52 +0900
> > Vincent Mailhol via B4 Relay <devnull+mailhol.vincent.wanadoo.fr@kernel.org> wrote:
> >   
> >> From: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
> >>
> >> In an upcoming change, GENMASK() and its friends will indirectly
> >> depend on sizeof() which is not available in asm.
> >>
> >> Instead of adding further complexity to __GENMASK() to make it work
> >> for both asm and non asm, just split the definition of the two
> >> variants.  
> > ...  
> >> +#else /* defined(__ASSEMBLY__) */
> >> +
> >> +#define GENMASK(h, l)		__GENMASK(h, l)
> >> +#define GENMASK_ULL(h, l)	__GENMASK_ULL(h, l)  
> > 
> > What do those actually expand to now?
> > As I've said a few times both UL(0) and ULL(0) are just (0) for __ASSEMBLY__
> > so the expansions of __GENMASK() and __GENMASK_ULL() contained the
> > same numeric constants.  
> 
> Indeed, in asm, the UL(0) and ULL(0) expands to the same thing: 0.
> 
> But the two macros still expand to something different on 32 bits
> architectures:
> 
>   * __GENMASK:
> 
>       (((~(0)) << (l)) & (~(0) >> (32 - 1 - (h))))
> 
>   * __GENMASK_ULL:
> 
>       (((~(0)) << (l)) & (~(0) >> (64 - 1 - (h))))
> 
> On 64 bits architecture these are the same.

If the assembler is naive and uses the cpu shift instruction for the >>
then a lot of cpu (including all x86 since the 286) mask off the high bits.
So __GENMASK_ULL() may well generate the expected pattern - provided it
is 32bits wide.

> 
> > This means they should be generating the same values.
> > I don't know the correct 'sizeof (int_type)' for the shift right of ~0.
> > My suspicion is that a 32bit assembler used 32bit signed integers and a
> > 64bit one 64bit signed integers (but a 32bit asm on a 64bit host might
> > be 64bit).
> > So the asm versions need to avoid the right shift and only do left shifts.
> > 
> > Which probably means they need to be enirely separate from the C versions.
> > And then the C ones can have all the ULL() removed.  
> 
> In this v5, I already have the ULL() removed from the non-uapi C
> version. And we are left with two distinct variants:
> 
>   - the uapi C & asm
>   - the non-uapi C (including fix width)
> 
> For the uapi ones, I do not think we can modify it without a risk of
> breaking some random userland. At least, this is not a risk I will take.
> And if we have to keep the __GENMASK() and __GENMASK_ULL(), then I would
> rather just reuse these for the asm variant instead of splitting further
> more and finding ourselves with three variants:
> 
>   - the uapi C
>   - the asm
>   - the non-uapi C (including fix width)
> 
> If __GENMASK() and __GENMASK_ULL() were not in the uapi, I would have
> agreed with you.
> 
> If you believe that the risk of modifying the uapi GENMASK*() is low
> enough, then you can submit a patch. But I will definitely not touch
> these myself.

I don't think you'll break userspace by stopping the uapi .h working
for asm files (with __ASSEMBLER__ defined).

But someone else might have a different opinion.

> 
> 
> Yours sincerely,
> Vincent Mailhol
>

Vincent Mailhol March 7, 2025, 3:50 p.m. UTC | #7

On 07/03/2025 at 22:27, David Laight wrote:
> On Fri, 7 Mar 2025 18:58:08 +0900
> Vincent Mailhol <mailhol.vincent@wanadoo.fr> wrote:
> 
>> On 07/03/2025 at 04:23, David Laight wrote:
>>> On Thu, 06 Mar 2025 20:29:52 +0900
>>> Vincent Mailhol via B4 Relay <devnull+mailhol.vincent.wanadoo.fr@kernel.org> wrote:
>>>   
>>>> From: Vincent Mailhol <mailhol.vincent@wanadoo.fr>

(...)

>>>> +#define GENMASK(h, l)		__GENMASK(h, l)
>>>> +#define GENMASK_ULL(h, l)	__GENMASK_ULL(h, l)  
>>>
>>> What do those actually expand to now?
>>> As I've said a few times both UL(0) and ULL(0) are just (0) for __ASSEMBLY__
>>> so the expansions of __GENMASK() and __GENMASK_ULL() contained the
>>> same numeric constants.  
>>
>> Indeed, in asm, the UL(0) and ULL(0) expands to the same thing: 0.
>>
>> But the two macros still expand to something different on 32 bits
>> architectures:
>>
>>   * __GENMASK:
>>
>>       (((~(0)) << (l)) & (~(0) >> (32 - 1 - (h))))
>>
>>   * __GENMASK_ULL:
>>
>>       (((~(0)) << (l)) & (~(0) >> (64 - 1 - (h))))
>>
>> On 64 bits architecture these are the same.
> 
> If the assembler is naive and uses the cpu shift instruction for the >>
> then a lot of cpu (including all x86 since the 286) mask off the high bits.
> So __GENMASK_ULL() may well generate the expected pattern - provided it
> is 32bits wide.

"If", "may", that's still a lot of conditionals in your sentence :)

I do not have enough knowledge to prove or disprove this, but what I am
sure of is that this uncertainty tells me that this is not something I
want to touch myself.

I picked up this stalled fixed width patch series and re-spinned it
because I had confidence here. I do not want to extend this series with
some asm clean-up which looks uncertain to me. I am not telling you are
wrong but just that I will happily delegate this to whoever has more
confidence than me!

>>> This means they should be generating the same values.
>>> I don't know the correct 'sizeof (int_type)' for the shift right of ~0.
>>> My suspicion is that a 32bit assembler used 32bit signed integers and a
>>> 64bit one 64bit signed integers (but a 32bit asm on a 64bit host might
>>> be 64bit).
>>> So the asm versions need to avoid the right shift and only do left shifts.
>>>
>>> Which probably means they need to be enirely separate from the C versions.
>>> And then the C ones can have all the ULL() removed.  
>>
>> In this v5, I already have the ULL() removed from the non-uapi C
>> version. And we are left with two distinct variants:
>>
>>   - the uapi C & asm
>>   - the non-uapi C (including fix width)
>>
>> For the uapi ones, I do not think we can modify it without a risk of
>> breaking some random userland. At least, this is not a risk I will take.
>> And if we have to keep the __GENMASK() and __GENMASK_ULL(), then I would
>> rather just reuse these for the asm variant instead of splitting further
>> more and finding ourselves with three variants:
>>
>>   - the uapi C
>>   - the asm
>>   - the non-uapi C (including fix width)
>>
>> If __GENMASK() and __GENMASK_ULL() were not in the uapi, I would have
>> agreed with you.
>>
>> If you believe that the risk of modifying the uapi GENMASK*() is low
>> enough, then you can submit a patch. But I will definitely not touch
>> these myself.
> 
> I don't think you'll break userspace by stopping the uapi .h working
> for asm files (with __ASSEMBLER__ defined).
> 
> But someone else might have a different opinion.

How do you know that there isn't someone somewhere who is using the uapi
__GENMASK*() in their userland asm code? Wouldn't such code break?

You may argue that the uapi is not meant to be used that way, but at the
end, we are not supposed to make assumptions on how the uapi code will
be used. Once it is exposed, if something break because the uapi was not
used in the intended way, it is still the kernel fault, not the userland.


Yours sincerely,
Vincent Mailhol

Andy Shevchenko March 7, 2025, 4:07 p.m. UTC | #8

On Fri, Mar 07, 2025 at 07:11:42PM +0900, Vincent Mailhol wrote:
> On 07/03/2025 at 02:55, Andy Shevchenko wrote:
> > On Fri, Mar 07, 2025 at 01:08:15AM +0900, Vincent Mailhol wrote:
> >> On 06/03/2025 at 22:11, Andy Shevchenko wrote:
> >>> On Thu, Mar 06, 2025 at 08:29:58PM +0900, Vincent Mailhol via B4 Relay wrote:
> >>>> From: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
> >>>>
> >>>> Add some additional tests in lib/test_bits.c to cover the expected
> >>>> results of the fixed type BIT_U*() macros.
> >>>
> >>> Still would be good to have a small assembly test case for GENMASK*() as they
> >>> went split and it will be a good regression test in case somebody decides to
> >>> unify both without much thinking..
> >>
> >> Let me confirm that I correctly understood your ask. Would something
> >> like this meet your expectations?
> > 
> > I believe it should be written in asm.
> 
> I am not confident enough in my assembly skills to submit asm patches to
> the kernel. So, I would rather take a pass on that one.
> 
> Regardless, if somebody decides to unify both without much thinking as
> you said, I am fully confident that the patch will get Nack-ed right

As I said above "would be good", if you think it's not feasible by you, perhaps
a comment (FIXME: ?) in the Kunit test cases that we lack of / need an asm test
as well.

Vincent Mailhol March 7, 2025, 4:47 p.m. UTC | #9

On 08/03/2025 at 01:07, Andy Shevchenko wrote:
> On Fri, Mar 07, 2025 at 07:11:42PM +0900, Vincent Mailhol wrote:
>> On 07/03/2025 at 02:55, Andy Shevchenko wrote:
>>> On Fri, Mar 07, 2025 at 01:08:15AM +0900, Vincent Mailhol wrote:
>>>> On 06/03/2025 at 22:11, Andy Shevchenko wrote:
>>>>> On Thu, Mar 06, 2025 at 08:29:58PM +0900, Vincent Mailhol via B4 Relay wrote:
>>>>>> From: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
>>>>>>
>>>>>> Add some additional tests in lib/test_bits.c to cover the expected
>>>>>> results of the fixed type BIT_U*() macros.
>>>>>
>>>>> Still would be good to have a small assembly test case for GENMASK*() as they
>>>>> went split and it will be a good regression test in case somebody decides to
>>>>> unify both without much thinking..
>>>>
>>>> Let me confirm that I correctly understood your ask. Would something
>>>> like this meet your expectations?
>>>
>>> I believe it should be written in asm.
>>
>> I am not confident enough in my assembly skills to submit asm patches to
>> the kernel. So, I would rather take a pass on that one.
>>
>> Regardless, if somebody decides to unify both without much thinking as
>> you said, I am fully confident that the patch will get Nack-ed right
> 
> As I said above "would be good", if you think it's not feasible by you, perhaps
> a comment (FIXME: ?) in the Kunit test cases that we lack of / need an asm test
> as well.

Ack. I will add a FIXME message in v6.


Yours sincerely,
Vincent Mailhol

David Laight March 9, 2025, 1:58 a.m. UTC | #10

On Fri, 7 Mar 2025 18:58:08 +0900
Vincent Mailhol <mailhol.vincent@wanadoo.fr> wrote:

> On 07/03/2025 at 04:23, David Laight wrote:
> > On Thu, 06 Mar 2025 20:29:52 +0900
> > Vincent Mailhol via B4 Relay <devnull+mailhol.vincent.wanadoo.fr@kernel.org> wrote:
> >   
> >> From: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
> >>
> >> In an upcoming change, GENMASK() and its friends will indirectly
> >> depend on sizeof() which is not available in asm.
> >>
> >> Instead of adding further complexity to __GENMASK() to make it work
> >> for both asm and non asm, just split the definition of the two
> >> variants.  
> > ...  
> >> +#else /* defined(__ASSEMBLY__) */
> >> +
> >> +#define GENMASK(h, l)		__GENMASK(h, l)
> >> +#define GENMASK_ULL(h, l)	__GENMASK_ULL(h, l)  
> > 
> > What do those actually expand to now?
> > As I've said a few times both UL(0) and ULL(0) are just (0) for __ASSEMBLY__
> > so the expansions of __GENMASK() and __GENMASK_ULL() contained the
> > same numeric constants.  
> 
> Indeed, in asm, the UL(0) and ULL(0) expands to the same thing: 0.
> 
> But the two macros still expand to something different on 32 bits
> architectures:
> 
>   * __GENMASK:
> 
>       (((~(0)) << (l)) & (~(0) >> (32 - 1 - (h))))
> 
>   * __GENMASK_ULL:
> 
>       (((~(0)) << (l)) & (~(0) >> (64 - 1 - (h))))
> 
> On 64 bits architecture these are the same.

I've just fed those into godbolt (https://www.godbolt.org/z/Ter6WE9qE) as:
int fi(void)
{
    int v;
    asm("mov $(((~(0)) << (8)) & (~(0) >> (32 - 1 - (15)))),%0": "=r" (v));
    return v -(((~(0u)) << (8)) & (~(0u) >> (32 - 1 - (15))));
}

gas warns:
<source>:4: Warning: 0xffffffffff00 shortened to 0xffffff00

unsigned long long fll(void)
{
    unsigned long long v;
    asm("mov $(((~(0)) << (8)) & (~(0) >> (64 - 1 - (15)))),%0": "=r" (v));
    return v -(((~(0ull)) << (8)) & (~(0ull) >> (64 - 1 - (15))));
}

(for other architectures you'll need to change the opcode)

For x86 and x86-32 the assembler seems to be doing 64bit maths with unsigned
right shifts - so the second function (with the 64 in it) generates 0xff00.
I doubt a 32bit only assembler does 64bit maths, but the '>> 48' above
might get masked to a '>> 16' by the cpu and generate the correct result.

So __GENMASK() is likely to be broken for any assembler that supports 64bits
when generating 32bit code.
__GENMASK_ULL() works (assuming all have unsigned >>) on 64bit assemblers
(even when generating 32bit code). It may work on some 32bit assemblers.

Since most uses in the header files will be GENMASK() I doubt (hope) no
asm code actually uses the values!
The headers assemble - but that is about all that can be said.

Bags of worms :-)

	David

David Laight March 9, 2025, 10:23 a.m. UTC | #11

On Sun, 9 Mar 2025 01:58:53 +0000
David Laight <david.laight.linux@gmail.com> wrote:

> On Fri, 7 Mar 2025 18:58:08 +0900
> Vincent Mailhol <mailhol.vincent@wanadoo.fr> wrote:
> 
> > On 07/03/2025 at 04:23, David Laight wrote:  
> > > On Thu, 06 Mar 2025 20:29:52 +0900
> > > Vincent Mailhol via B4 Relay <devnull+mailhol.vincent.wanadoo.fr@kernel.org> wrote:
> > >     
> > >> From: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
> > >>
> > >> In an upcoming change, GENMASK() and its friends will indirectly
> > >> depend on sizeof() which is not available in asm.
> > >>
> > >> Instead of adding further complexity to __GENMASK() to make it work
> > >> for both asm and non asm, just split the definition of the two
> > >> variants.    
> > > ...    
> > >> +#else /* defined(__ASSEMBLY__) */
> > >> +
> > >> +#define GENMASK(h, l)		__GENMASK(h, l)
> > >> +#define GENMASK_ULL(h, l)	__GENMASK_ULL(h, l)    
> > > 
> > > What do those actually expand to now?
> > > As I've said a few times both UL(0) and ULL(0) are just (0) for __ASSEMBLY__
> > > so the expansions of __GENMASK() and __GENMASK_ULL() contained the
> > > same numeric constants.    
> > 
> > Indeed, in asm, the UL(0) and ULL(0) expands to the same thing: 0.
> > 
> > But the two macros still expand to something different on 32 bits
> > architectures:
> > 
> >   * __GENMASK:
> > 
> >       (((~(0)) << (l)) & (~(0) >> (32 - 1 - (h))))
> > 
> >   * __GENMASK_ULL:
> > 
> >       (((~(0)) << (l)) & (~(0) >> (64 - 1 - (h))))
> > 
> > On 64 bits architecture these are the same.  
> 
> I've just fed those into godbolt (https://www.godbolt.org/z/Ter6WE9qE) as:
> int fi(void)
> {
>     int v;
>     asm("mov $(((~(0)) << (8)) & (~(0) >> (32 - 1 - (15)))),%0": "=r" (v));
>     return v -(((~(0u)) << (8)) & (~(0u) >> (32 - 1 - (15))));
> }
> 
> gas warns:
> <source>:4: Warning: 0xffffffffff00 shortened to 0xffffff00
> 
> unsigned long long fll(void)
> {
>     unsigned long long v;
>     asm("mov $(((~(0)) << (8)) & (~(0) >> (64 - 1 - (15)))),%0": "=r" (v));
>     return v -(((~(0ull)) << (8)) & (~(0ull) >> (64 - 1 - (15))));
> }
> 
> (for other architectures you'll need to change the opcode)
> 
> For x86 and x86-32 the assembler seems to be doing 64bit maths with unsigned
> right shifts - so the second function (with the 64 in it) generates 0xff00.
> I doubt a 32bit only assembler does 64bit maths, but the '>> 48' above
> might get masked to a '>> 16' by the cpu and generate the correct result.
> 
> So __GENMASK() is likely to be broken for any assembler that supports 64bits
> when generating 32bit code.
> __GENMASK_ULL() works (assuming all have unsigned >>) on 64bit assemblers
> (even when generating 32bit code). It may work on some 32bit assemblers.

I've remembered my 'pi' has a 32bit userspace (on a 64bit kernel).
I quick test of "mov %0,#(...)" and bits 11..8 gives the correct output
for size '32' but the error message:
/tmp/ccPB7bWh.s:26: Warning: shift count out of range (56 is not between 0 and 31)
with size '64'.

Assuming that part of the gnu assembler is consistent across architectures
you can't use either GENMASK in asm for 32bit architectures.

Any change (probably including removing the asm support for the uapi) isn't
going to make things worse!

	David

> 
> Since most uses in the header files will be GENMASK() I doubt (hope) no
> asm code actually uses the values!
> The headers assemble - but that is about all that can be said.
> 
> Bags of worms :-)
> 
> 	David
>

Vincent Mailhol March 10, 2025, 10:46 a.m. UTC | #12

On 09/03/2025 at 19:23, David Laight wrote:
> On Sun, 9 Mar 2025 01:58:53 +0000
> David Laight <david.laight.linux@gmail.com> wrote:
> 
>> On Fri, 7 Mar 2025 18:58:08 +0900
>> Vincent Mailhol <mailhol.vincent@wanadoo.fr> wrote:
>>
>>> On 07/03/2025 at 04:23, David Laight wrote:  
>>>> On Thu, 06 Mar 2025 20:29:52 +0900
>>>> Vincent Mailhol via B4 Relay <devnull+mailhol.vincent.wanadoo.fr@kernel.org> wrote:
>>>>     
>>>>> From: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
>>>>>
>>>>> In an upcoming change, GENMASK() and its friends will indirectly
>>>>> depend on sizeof() which is not available in asm.
>>>>>
>>>>> Instead of adding further complexity to __GENMASK() to make it work
>>>>> for both asm and non asm, just split the definition of the two
>>>>> variants.    
>>>> ...    
>>>>> +#else /* defined(__ASSEMBLY__) */
>>>>> +
>>>>> +#define GENMASK(h, l)		__GENMASK(h, l)
>>>>> +#define GENMASK_ULL(h, l)	__GENMASK_ULL(h, l)    
>>>>
>>>> What do those actually expand to now?
>>>> As I've said a few times both UL(0) and ULL(0) are just (0) for __ASSEMBLY__
>>>> so the expansions of __GENMASK() and __GENMASK_ULL() contained the
>>>> same numeric constants.    
>>>
>>> Indeed, in asm, the UL(0) and ULL(0) expands to the same thing: 0.
>>>
>>> But the two macros still expand to something different on 32 bits
>>> architectures:
>>>
>>>   * __GENMASK:
>>>
>>>       (((~(0)) << (l)) & (~(0) >> (32 - 1 - (h))))
>>>
>>>   * __GENMASK_ULL:
>>>
>>>       (((~(0)) << (l)) & (~(0) >> (64 - 1 - (h))))
>>>
>>> On 64 bits architecture these are the same.  
>>
>> I've just fed those into godbolt (https://www.godbolt.org/z/Ter6WE9qE) as:
>> int fi(void)
>> {
>>     int v;
>>     asm("mov $(((~(0)) << (8)) & (~(0) >> (32 - 1 - (15)))),%0": "=r" (v));
>>     return v -(((~(0u)) << (8)) & (~(0u) >> (32 - 1 - (15))));
>> }
>>
>> gas warns:
>> <source>:4: Warning: 0xffffffffff00 shortened to 0xffffff00
>>
>> unsigned long long fll(void)
>> {
>>     unsigned long long v;
>>     asm("mov $(((~(0)) << (8)) & (~(0) >> (64 - 1 - (15)))),%0": "=r" (v));
>>     return v -(((~(0ull)) << (8)) & (~(0ull) >> (64 - 1 - (15))));
>> }
>>
>> (for other architectures you'll need to change the opcode)
>>
>> For x86 and x86-32 the assembler seems to be doing 64bit maths with unsigned
>> right shifts - so the second function (with the 64 in it) generates 0xff00.
>> I doubt a 32bit only assembler does 64bit maths, but the '>> 48' above
>> might get masked to a '>> 16' by the cpu and generate the correct result.
>>
>> So __GENMASK() is likely to be broken for any assembler that supports 64bits
>> when generating 32bit code.
>> __GENMASK_ULL() works (assuming all have unsigned >>) on 64bit assemblers
>> (even when generating 32bit code). It may work on some 32bit assemblers.>
> I've remembered my 'pi' has a 32bit userspace (on a 64bit kernel).
> I quick test of "mov %0,#(...)" and bits 11..8 gives the correct output
> for size '32' but the error message:
> /tmp/ccPB7bWh.s:26: Warning: shift count out of range (56 is not between 0 and 31)
> with size '64'.
> 
> Assuming that part of the gnu assembler is consistent across architectures
> you can't use either GENMASK in asm for 32bit architectures.
> 
> Any change (probably including removing the asm support for the uapi) isn't
> going to make things worse!
> 
> 	David
> 
>>
>> Since most uses in the header files will be GENMASK() I doubt (hope) no
>> asm code actually uses the values!

After reading your message, I wanted to understand where these
GENMASK*() were used in asm code. It happens that most of the
architectures simply don't use it!

I see these are using in aarch64, but that's about it.

I couldn't find any use of neither GENMASK() nor GENMASK_ULL() in x86,
arm 32 bits, m68k or powerpc asm code (I did not check the other
architectures).

aarch64 uses both the long and long long variants, but this being a 64
bits architecture, these are the same. So OK.

So, this goes into the same direction as you: I guess that the fact that
no one noticed the issue is that no one uses this on a 32 bits arch,
probably for historical reasons, i.e. the asm code was written before
the introduction of the GENMASK() macros.

>> The headers assemble - but that is about all that can be said.
>>
>> Bags of worms :-)

+1 (I will not open that bag)

I think that the asm and non asm variant should have used a different
implementation from the begining. By wanting to have a single variant
that fit both the asm and the C code, we have a complex result that is
hard to understand and maintain (c.f. the bug which you pointed out
which have been unnoticed for ever).

But now that it is in the uapi, I am not sure of what is the best
approach. I sincerely hope that we can just modify it with no userland
impact.

Well, just to make it clear, I am not touching the asm part. I
acknowledge the issue, but I do not think that I have the skill to fix it.


Yours sincerely,
Vincent Mailhol

[v5,0/7] bits: Fixed-type GENMASK()/BIT()

Message

Comments