
[v4,3/8] bits: introduce fixed-type genmasks

Message ID 20250305-fixed-type-genmasks-v4-3-1873dcdf6723@wanadoo.fr
State New
Series bits: Fixed-type GENMASK()/BIT()

Commit Message

Vincent Mailhol via B4 Relay March 5, 2025, 1 p.m. UTC
From: Yury Norov <yury.norov@gmail.com>

Add __GENMASK_t() which generalizes __GENMASK() to support different
types, and implement fixed-types versions of GENMASK() based on it.
The fixed-type version allows more strict checks to the min/max values
accepted, which is useful for defining registers like implemented by
i915 and xe drivers with their REG_GENMASK*() macros.

The strict checks rely on shift-count-overflow compiler check to fail
the build if a number outside of the range allowed is passed.
Example:

	#define FOO_MASK GENMASK_U32(33, 4)

will generate a warning like:

	../include/linux/bits.h:41:31: error: left shift count >= width of type [-Werror=shift-count-overflow]
	   41 |          (((t)~0ULL - ((t)(1) << (l)) + 1) & \
	      |                               ^~

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
---
Changelog:

  v3 -> v4:

    - The v3 is one year old. Meanwhile people started using
      __GENMASK() directly. So instead of generalizing __GENMASK() to
      support different types, add a new GENMASK_t().

    - replace ~0ULL by ~_ULL(0). Otherwise, __GENMASK_t() would fail
      in asm code.

    - Make GENMASK_U8() and GENMASK_U16() return an unsigned int. In
      v3, due to the integer promotion rules, these were returning a
      signed integer. By casting these to unsigned int, at least the
      signedness is kept.
---
 include/linux/bitops.h |  1 -
 include/linux/bits.h   | 33 +++++++++++++++++++++++++++++----
 2 files changed, 29 insertions(+), 5 deletions(-)
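
The arithmetic described in the commit message can be tried outside the kernel. The following is only a user-space sketch: the SKETCH_* names are hypothetical, <stdint.h> types stand in for u32/u64, and GENMASK_INPUT_CHECK() is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins for the kernel helpers. */
#define SKETCH_BITS_PER_BYTE		8
#define SKETCH_BITS_PER_TYPE(type)	(sizeof(type) * SKETCH_BITS_PER_BYTE)

/*
 * Same arithmetic as GENMASK_t() in this patch, without the input check:
 *   - first operand:  every bit of @t from position @l upwards
 *   - second operand: every bit of @t from position @h downwards
 * The AND of both is the contiguous mask covering bits @h..@l.
 */
#define SKETCH_GENMASK_T(t, h, l)					\
	((((t)~0ULL) - ((t)1 << (l)) + 1) &				\
	 (((t)~0ULL) >> (SKETCH_BITS_PER_TYPE(t) - 1 - (h))))

#define SKETCH_GENMASK_U32(h, l)	SKETCH_GENMASK_T(uint32_t, h, l)
#define SKETCH_GENMASK_U64(h, l)	SKETCH_GENMASK_T(uint64_t, h, l)

/* Extract the field selected by bits @h..@l of a 32-bit register value. */
#define SKETCH_FIELD_U32(reg, h, l)					\
	(((uint32_t)(reg) & SKETCH_GENMASK_U32(h, l)) >> (l))
```

For instance, SKETCH_GENMASK_U32(7, 4) evaluates to 0xF0, and SKETCH_FIELD_U32(0x1234, 7, 4) picks out the nibble 0x3.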

Comments

Vincent Mailhol March 5, 2025, 2:38 p.m. UTC | #1
On 05/03/2025 at 23:30, Andy Shevchenko wrote:
> On Wed, Mar 05, 2025 at 10:00:15PM +0900, Vincent Mailhol via B4 Relay wrote:
>> From: Yury Norov <yury.norov@gmail.com>
>>
>> Add __GENMASK_t() which generalizes __GENMASK() to support different
> 
> Is it with double underscore? I do not see it.

This is my mistake. In an earlier draft, it was __GENMASK_t().
Meanwhile, I dropped the __ prefix but forgot to update the patch
description.

> _t is used for typedef simple types. It's unfortunate to have it
> in such a macro.

Ack.

> Perhaps T or TYPE will suffice. Or perhaps we want
> __GENMASK_Uxx() here?

If no objection, I have a preference for GENMASK_TYPE().

>> types, and implement fixed-types versions of GENMASK() based on it.
>> The fixed-type version allows more strict checks to the min/max values
>> accepted, which is useful for defining registers like implemented by
>> i915 and xe drivers with their REG_GENMASK*() macros.
>>
>> The strict checks rely on shift-count-overflow compiler check to fail
>> the build if a number outside of the range allowed is passed.
>> Example:
>>
>> 	#define FOO_MASK GENMASK_U32(33, 4)
>>
>> will generate a warning like:
>>
>> 	../include/linux/bits.h:41:31: error: left shift count >= width of type [-Werror=shift-count-overflow]
>> 	   41 |          (((t)~0ULL - ((t)(1) << (l)) + 1) & \
>> 	      |                               ^~
> 
> ...
> 
>> + * __GENMASK_U*() depends on BITS_PER_TYPE() which would not work in the asm
> 
> Where are the double underscore variants? I see it only for U128.

Same as above. The description is incorrect. I will fix this in v5.

>> + * code as BITS_PER_TYPE() relies on sizeof(), something not available in
>> + * asm. Nevertheless, the concept of fixed width integers is a C thing which
>> + * does not apply to assembly code.

Yours sincerely,
Vincent Mailhol
Yury Norov March 5, 2025, 3:47 p.m. UTC | #2
On Wed, Mar 05, 2025 at 10:00:15PM +0900, Vincent Mailhol via B4 Relay wrote:
> From: Yury Norov <yury.norov@gmail.com>
> 
> Add __GENMASK_t() which generalizes __GENMASK() to support different
> types, and implement fixed-types versions of GENMASK() based on it.
> The fixed-type version allows more strict checks to the min/max values
> accepted, which is useful for defining registers like implemented by
> i915 and xe drivers with their REG_GENMASK*() macros.
> 
> The strict checks rely on shift-count-overflow compiler check to fail
> the build if a number outside of the range allowed is passed.
> Example:
> 
> 	#define FOO_MASK GENMASK_U32(33, 4)
> 
> will generate a warning like:
> 
> 	../include/linux/bits.h:41:31: error: left shift count >= width of type [-Werror=shift-count-overflow]
> 	   41 |          (((t)~0ULL - ((t)(1) << (l)) + 1) & \
> 	      |                               ^~
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> Acked-by: Jani Nikula <jani.nikula@intel.com>
> Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>

Co-developed-by?

> ---
> Changelog:
> 
>   v3 -> v4:
> 
>     - The v3 is one year old. Meanwhile people started using
>       __GENMASK() directly. So instead of generalizing __GENMASK() to
>       support different types, add a new GENMASK_t().
> 
>     - replace ~0ULL by ~_ULL(0). Otherwise, __GENMASK_t() would fail
>       in asm code.
> 
>     - Make GENMASK_U8() and GENMASK_U16() return an unsigned int. In
>       v3, due to the integer promotion rules, these were returning a
>       signed integer. By casting these to unsigned int, at least the

This comment will disappear when I apply the patch. Can you put it
in the code instead?

>       signedness is kept.
> ---
>  include/linux/bitops.h |  1 -
>  include/linux/bits.h   | 33 +++++++++++++++++++++++++++++----
>  2 files changed, 29 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/bitops.h b/include/linux/bitops.h
> index c1cb53cf2f0f8662ed3e324578f74330e63f935d..9be2d50da09a417966b3d11c84092bb2f4cd0bef 100644
> --- a/include/linux/bitops.h
> +++ b/include/linux/bitops.h
> @@ -8,7 +8,6 @@
>  
>  #include <uapi/linux/kernel.h>
>  
> -#define BITS_PER_TYPE(type)	(sizeof(type) * BITS_PER_BYTE)
>  #define BITS_TO_LONGS(nr)	__KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(long))
>  #define BITS_TO_U64(nr)		__KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u64))
>  #define BITS_TO_U32(nr)		__KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u32))
> diff --git a/include/linux/bits.h b/include/linux/bits.h
> index 5f68980a1b98d771426872c74d7b5c0f79e5e802..f202e46d2f4b7899c16d975120f3fa3ae41556ae 100644
> --- a/include/linux/bits.h
> +++ b/include/linux/bits.h
> @@ -12,6 +12,7 @@
>  #define BIT_ULL_MASK(nr)	(ULL(1) << ((nr) % BITS_PER_LONG_LONG))
>  #define BIT_ULL_WORD(nr)	((nr) / BITS_PER_LONG_LONG)
>  #define BITS_PER_BYTE		8
> +#define BITS_PER_TYPE(type)	(sizeof(type) * BITS_PER_BYTE)
>  
>  /*
>   * Create a contiguous bitmask starting at bit position @l and ending at
> @@ -25,14 +26,38 @@
>  
>  #define GENMASK_INPUT_CHECK(h, l) BUILD_BUG_ON_ZERO(const_true((l) > (h)))
>  
> -#define GENMASK(h, l) \
> -	(GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
> -#define GENMASK_ULL(h, l) \
> -	(GENMASK_INPUT_CHECK(h, l) + __GENMASK_ULL(h, l))
> +/*
> + * Generate a mask for the specified type @t. Additional checks are made to
> + * guarantee the value returned fits in that type, relying on
> + * shift-count-overflow compiler check to detect incompatible arguments.
> + * For example, all these create build errors or warnings:
> + *
> + * - GENMASK(15, 20): wrong argument order
> + * - GENMASK(72, 15): doesn't fit unsigned long
> + * - GENMASK_U32(33, 15): doesn't fit in a u32
> + */
> +#define GENMASK_t(t, h, l)				\

Agree with Andy. This should be GENMASK_TYPE, or triple-underscored
___GENMASK() maybe. This _t thing looks misleading.

> +	(GENMASK_INPUT_CHECK(h, l) +			\
> +	 (((t)~ULL(0) - ((t)1 << (l)) + 1) &		\
> +	  ((t)~ULL(0) >> (BITS_PER_TYPE(t) - 1 - (h)))))

Can you rebase it on top of -next? In this dev cycle I merged a patch
that reverts __GENMASK() back to:

#define __GENMASK(h, l) (((~_UL(0)) << (l)) & (~_UL(0) >> (BITS_PER_LONG - 1 - (h))))
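
For unsigned long, the subtraction form used by GENMASK_t() in this patch and the shift form quoted above produce the same values, since ~0UL - (1UL << l) + 1 equals ~0UL << l in modular arithmetic. A user-space sketch of that comparison, with hypothetical SKETCH_* names and plain 0UL/1UL literals in place of _UL():

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)

/* Subtraction form, as in GENMASK_t() from this patch. */
#define SKETCH_MASK_SUB(h, l)						\
	((~0UL - (1UL << (l)) + 1) &					\
	 (~0UL >> (SKETCH_BITS_PER_LONG - 1 - (h))))

/* Shift form, as in the __GENMASK() definition quoted above. */
#define SKETCH_MASK_SHIFT(h, l)						\
	((~0UL << (l)) &						\
	 (~0UL >> (SKETCH_BITS_PER_LONG - 1 - (h))))

/* Returns 1 when both forms agree for every valid (h, l) pair. */
static int sketch_forms_agree(void)
{
	unsigned int h, l;

	for (h = 0; h < SKETCH_BITS_PER_LONG; h++)
		for (l = 0; l <= h; l++)
			if (SKETCH_MASK_SUB(h, l) != SKETCH_MASK_SHIFT(h, l))
				return 0;
	return 1;
}
```

This only checks the computed values; it says nothing about which form gives better diagnostics or code generation.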

> +#define GENMASK(h, l) GENMASK_t(unsigned long,  h, l)
> +#define GENMASK_ULL(h, l) GENMASK_t(unsigned long long, h, l)

This makes __GENMASK() and __GENMASK_ULL() unused in the kernel, other
than in uapi. Or do I misunderstand it?

Having, in fact, different implementations of the same macro for kernel
and userspace is a source of problems. Can we move GENMASK_TYPE() to uapi,
and implement __GENMASK() on top of them? If not, I'd prefer to keep
GENMASK and GENMASK_ULL untouched.

Can you run bloat-o-meter and ensure there's no unwanted effects on
code generation?

>  /*
>   * Missing asm support
>   *
> + * __GENMASK_U*() depends on BITS_PER_TYPE() which would not work in the asm

And there's no __GENMASK_U*(), right?

> + * code as BITS_PER_TYPE() relies on sizeof(), something not available in
> + * asm. Nevertheless, the concept of fixed width integers is a C thing which
> + * does not apply to assembly code.
> + */
> +#define GENMASK_U8(h, l) ((unsigned int)GENMASK_t(u8,  h, l))
> +#define GENMASK_U16(h, l) ((unsigned int)GENMASK_t(u16, h, l))

Typecast to the type that the user provides explicitly? And maybe do
it in GENMASK_TYPE()

> +#define GENMASK_U32(h, l) GENMASK_t(u32, h, l)
> +#define GENMASK_U64(h, l) GENMASK_t(u64, h, l)

OK, this looks good. But GENMASK_U128() becomes a special case now.
The 128-bit GENMASK is unused, but it's exported in uapi. Is there any
simple way to end up with a common implementation for all fixed-type
GENMASKs?

> +
> +/*
>   * __GENMASK_U128() depends on _BIT128() which would not work
>   * in the asm code, as it shifts an 'unsigned __int128' data
>   * type instead of direct representation of 128 bit constants

This comment is duplicated by the previous one. Maybe just join them?
(Let's wait for a while for updates regarding GENMASK_U128 status before
doing it.)
Vincent Mailhol March 5, 2025, 4:48 p.m. UTC | #3
On 06/03/2025 at 00:47, Yury Norov wrote:
> On Wed, Mar 05, 2025 at 10:00:15PM +0900, Vincent Mailhol via B4 Relay wrote:
>> From: Yury Norov <yury.norov@gmail.com>
>>
>> Add __GENMASK_t() which generalizes __GENMASK() to support different
>> types, and implement fixed-types versions of GENMASK() based on it.
>> The fixed-type version allows more strict checks to the min/max values
>> accepted, which is useful for defining registers like implemented by
>> i915 and xe drivers with their REG_GENMASK*() macros.
>>
>> The strict checks rely on shift-count-overflow compiler check to fail
>> the build if a number outside of the range allowed is passed.
>> Example:
>>
>> 	#define FOO_MASK GENMASK_U32(33, 4)
>>
>> will generate a warning like:
>>
>> 	../include/linux/bits.h:41:31: error: left shift count >= width of type [-Werror=shift-count-overflow]
>> 	   41 |          (((t)~0ULL - ((t)(1) << (l)) + 1) & \
>> 	      |                               ^~
>>
>> Signed-off-by: Yury Norov <yury.norov@gmail.com>
>> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
>> Acked-by: Jani Nikula <jani.nikula@intel.com>
>> Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
> 
> Co-developed-by?

OK. I will keep you as the main author and credit myself as co-developer.

>> ---
>> Changelog:
>>
>>   v3 -> v4:
>>
>>     - The v3 is one year old. Meanwhile people started using
>>       __GENMASK() directly. So instead of generalizing __GENMASK() to
>>       support different types, add a new GENMASK_t().
>>
>>     - replace ~0ULL by ~_ULL(0). Otherwise, __GENMASK_t() would fail
>>       in asm code.
>>
>>     - Make GENMASK_U8() and GENMASK_U16() return an unsigned int. In
>>       v3, due to the integer promotion rules, these were returning a
>>       signed integer. By casting these to unsigned int, at least the
> 
> This comment will disappear when I apply the patch. Can you put it
> in the code instead?

Ack. I will add the comment below in the code:

  /*
   * Because of the C integer promotion rules, the U8 and the U16
   * variants would immediately become signed integers when used in
   * expressions. Cast them to unsigned int so that, at least, the
   * signedness is preserved.
   */

(unless you prefer to go back to the u8 and u16 casts, cf. below).
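
The promotion effect described in that comment can be observed in user space. In the sketch below the SKETCH_* names are hypothetical, uint8_t stands in for u8, a 32-bit int is assumed, and C11 _Generic is used to report the type of an expression:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical masks for bits 7..4, in two flavours. */
#define SKETCH_MASK_AS_U8	((uint8_t)0xF0)
#define SKETCH_MASK_AS_UINT	((unsigned int)(uint8_t)0xF0)

/* Evaluate to 1 when @expr has exactly the named type, else 0. */
#define SKETCH_IS_INT(expr)	_Generic((expr), int: 1, default: 0)
#define SKETCH_IS_UINT(expr)	_Generic((expr), unsigned int: 1, default: 0)

/*
 * ~ promotes a uint8_t operand to (signed) int before inverting it, so the
 * result is a negative int. With the unsigned int cast, the operand is
 * already unsigned int and the inverted value stays unsigned.
 */
static int sketch_u8_mask_goes_negative(void)
{
	return SKETCH_IS_INT(~SKETCH_MASK_AS_U8) && (~SKETCH_MASK_AS_U8 < 0);
}

static int sketch_uint_mask_stays_unsigned(void)
{
	return SKETCH_IS_UINT(~SKETCH_MASK_AS_UINT) &&
	       (~SKETCH_MASK_AS_UINT == 0xFFFFFF0Fu);
}
```

Both helpers return 1 under the stated assumptions, which is the difference the unsigned int cast is meant to preserve.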

>>       signedness is kept.
>> ---
>>  include/linux/bitops.h |  1 -
>>  include/linux/bits.h   | 33 +++++++++++++++++++++++++++++----
>>  2 files changed, 29 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/linux/bitops.h b/include/linux/bitops.h
>> index c1cb53cf2f0f8662ed3e324578f74330e63f935d..9be2d50da09a417966b3d11c84092bb2f4cd0bef 100644
>> --- a/include/linux/bitops.h
>> +++ b/include/linux/bitops.h
>> @@ -8,7 +8,6 @@
>>  
>>  #include <uapi/linux/kernel.h>
>>  
>> -#define BITS_PER_TYPE(type)	(sizeof(type) * BITS_PER_BYTE)
>>  #define BITS_TO_LONGS(nr)	__KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(long))
>>  #define BITS_TO_U64(nr)		__KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u64))
>>  #define BITS_TO_U32(nr)		__KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u32))
>> diff --git a/include/linux/bits.h b/include/linux/bits.h
>> index 5f68980a1b98d771426872c74d7b5c0f79e5e802..f202e46d2f4b7899c16d975120f3fa3ae41556ae 100644
>> --- a/include/linux/bits.h
>> +++ b/include/linux/bits.h
>> @@ -12,6 +12,7 @@
>>  #define BIT_ULL_MASK(nr)	(ULL(1) << ((nr) % BITS_PER_LONG_LONG))
>>  #define BIT_ULL_WORD(nr)	((nr) / BITS_PER_LONG_LONG)
>>  #define BITS_PER_BYTE		8
>> +#define BITS_PER_TYPE(type)	(sizeof(type) * BITS_PER_BYTE)
>>  
>>  /*
>>   * Create a contiguous bitmask starting at bit position @l and ending at
>> @@ -25,14 +26,38 @@
>>  
>>  #define GENMASK_INPUT_CHECK(h, l) BUILD_BUG_ON_ZERO(const_true((l) > (h)))
>>  
>> -#define GENMASK(h, l) \
>> -	(GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
>> -#define GENMASK_ULL(h, l) \
>> -	(GENMASK_INPUT_CHECK(h, l) + __GENMASK_ULL(h, l))
>> +/*
>> + * Generate a mask for the specified type @t. Additional checks are made to
>> + * guarantee the value returned fits in that type, relying on
>> + * shift-count-overflow compiler check to detect incompatible arguments.
>> + * For example, all these create build errors or warnings:
>> + *
>> + * - GENMASK(15, 20): wrong argument order
>> + * - GENMASK(72, 15): doesn't fit unsigned long
>> + * - GENMASK_U32(33, 15): doesn't fit in a u32
>> + */
>> +#define GENMASK_t(t, h, l)				\
> 
> Agree with Andy. This should be GENMASK_TYPE, or triple-underscored
> ___GENMASK() maybe. This _t thing looks misleading.

My preference goes to GENMASK_TYPE().

>> +	(GENMASK_INPUT_CHECK(h, l) +			\
>> +	 (((t)~ULL(0) - ((t)1 << (l)) + 1) &		\
>> +	  ((t)~ULL(0) >> (BITS_PER_TYPE(t) - 1 - (h)))))
> 
> Can you rebase it on top of -next? In this dev cycle I merged a patch
> that reverts __GENMASK() back to:

Oh, I did not realize that. Do you mean a rebase on top of:

  https://github.com/norov/linux/tree/bitmap-for-next

?

I will do so.

> #define __GENMASK(h, l) (((~_UL(0)) << (l)) & (~_UL(0) >> (BITS_PER_LONG - 1 - (h))))
> 
>> +#define GENMASK(h, l) GENMASK_t(unsigned long,  h, l)
>> +#define GENMASK_ULL(h, l) GENMASK_t(unsigned long long, h, l)
> 
> This makes __GENMASK() and __GENMASK_ULL() unused in the kernel, other
> than in uapi. Or do I misunderstand it?

Correct.

> Having, in fact, different implementations of the same macro for kernel
> and userspace is a source of problems. Can we move GENMASK_TYPE() to uapi,
> and implement __GENMASK() on top of them? If not, I'd prefer to keep
> GENMASK and GENMASK_ULL untouched.

This is something which I tried to explain in the cover letter. I am not
comfortable declaring GENMASK_TYPE() in the uapi and exposing it to
userland. If we do so, any future change in the parameters would be a
user-breaking change. __GENMASK_U128() already looks like too much to me
for the uapi; I am not keen to bloat it even more with GENMASK_TYPE().

On top of that, if we use GENMASK_TYPE() to generate the asm variant,
then we cannot rely on sizeof() in the definition, which makes
everything overcomplicated.

I acknowledge that not having a common denominator is not ideal, but I
see this as an acceptable tradeoff.

> Can you run bloat-o-meter and ensure there's no unwanted effects on
> code generation?

Ack, but that will be tomorrow :)

>>  /*
>>   * Missing asm support
>>   *
>> + * __GENMASK_U*() depends on BITS_PER_TYPE() which would not work in the asm
> 
> And there's no __GENMASK_U*(), right?

Yes, silly typo, sorry. Will fix in v5.

>> + * code as BITS_PER_TYPE() relies on sizeof(), something not available in
>> + * asm. Nevertheless, the concept of fixed width integers is a C thing which
>> + * does not apply to assembly code.
>> + */
>> +#define GENMASK_U8(h, l) ((unsigned int)GENMASK_t(u8,  h, l))
>> +#define GENMASK_U16(h, l) ((unsigned int)GENMASK_t(u16, h, l))
> 
> Typecast to the type that the user provides explicitly? And maybe do
> it in GENMASK_TYPE()

I have a slight preference for the cast to unsigned int for the reason
explained above. But that is not a deal breaker. If you believe that the
u8/u16 casts are better, let me know, I will be happy to change it :)

>> +#define GENMASK_U32(h, l) GENMASK_t(u32, h, l)
>> +#define GENMASK_U64(h, l) GENMASK_t(u64, h, l)
> 
> OK, this looks good. But GENMASK_U128() becomes a special case now.
> The 128-bit GENMASK is unused, but it's exported in uapi. Is there any
> simple way to end up with a common implementation for all fixed-type
> GENMASKs?

What bothers me is that 128-bit types are not available on all
architectures, cf. CONFIG_ARCH_SUPPORTS_INT128. So, I would need a
U128() equivalent to ULL() which does not break on architectures that
do not support 128-bit integers.

This is where I am stuck. If someone can guide me on how to write a
robust U128() macro, then I think the common implementation could be
feasible.

>> +
>> +/*
>>   * __GENMASK_U128() depends on _BIT128() which would not work
>>   * in the asm code, as it shifts an 'unsigned __int128' data
>>   * type instead of direct representation of 128 bit constants
> 
> This comment is duplicated by the previous one. Maybe just join them?
> (Let's wait for a while for updates regarding GENMASK_U128 status before
> doing it.)

OK. I will wait for this one. I will probably send the v5 before we get
the answer, but I do not think this is an issue if we have two parallel
streams.


Yours sincerely,
Vincent Mailhol
Vincent Mailhol March 6, 2025, 9:22 a.m. UTC | #4
On 06/03/2025 at 04:45, Andy Shevchenko wrote:
>>> But GENMASK_U128() becomes a special case now.
>>> The 128-bit GENMASK is unused, but it's exported in uapi. Is there any
>>> simple way to end up with a common implementation for all fixed-type
>>> GENMASKs?
>>
>> What bothers me is that 128-bit types are not available on all
>> architectures, cf. CONFIG_ARCH_SUPPORTS_INT128. So, I would need a
>> U128() equivalent to ULL() which does not break on architectures that
>> do not support 128-bit integers.
>>
>> This is where I am stuck. If someone can guide me on how to write a
>> robust U128() macro, then I think the common implementation could be
>> feasible.
> 
> I think we may leave that U128 stuff alone for now.

I found the solution! The trick is to use type_max() from overflow.h.

With this, GENMASK_TYPE() becomes:

  #define GENMASK_TYPE(t, h, l)					\
  	((t)(GENMASK_INPUT_CHECK(h, l) +			\
  	     (type_max(t) << (l) &				\
  	      type_max(t) >> (BITS_PER_TYPE(t) - 1 - (h)))))

and works with all the GENMASK variants, including the U128 one! The
unit tests under lib/test_bits.c are all green.

Of course, this does *not* work in assembly. But as explained before,
GENMASK_TYPE() is guarded by a #if !defined(__ASSEMBLY__), so all good!

The question raised by Yury on whether or not we should keep
__GENMASK_U128() in the uapi still remains. In all honesty, I will
not touch that one. This is not in the scope of this series.
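
A user-space sketch of the same shape, for experimentation only. SKETCH_TYPE_MAX() is a hypothetical stand-in that is valid for unsigned types only and is not the kernel's type_max(); GENMASK_INPUT_CHECK() is omitted, and the 128-bit check assumes a compiler that defines __SIZEOF_INT128__ (GCC and Clang do on most 64-bit targets).

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BITS_PER_TYPE(t)	(sizeof(t) * 8)

/* Assumption: unsigned types only. NOT the kernel's type_max(). */
#define SKETCH_TYPE_MAX(t)	((t)~(t)0)

/* Same shape as the GENMASK_TYPE() above, without the input check. */
#define SKETCH_GENMASK_TYPE(t, h, l)					\
	((t)((SKETCH_TYPE_MAX(t) << (l)) &				\
	     (SKETCH_TYPE_MAX(t) >> (SKETCH_BITS_PER_TYPE(t) - 1 - (h)))))

/*
 * Returns 1 when bits 127..64 come out as expected. Also returns 1 when
 * the compiler has no __int128, in which case nothing is checked.
 */
static int sketch_u128_high_half_ok(void)
{
#ifdef __SIZEOF_INT128__
	unsigned __int128 m = SKETCH_GENMASK_TYPE(unsigned __int128, 127, 64);

	return (uint64_t)(m >> 64) == UINT64_MAX && (uint64_t)m == 0;
#else
	return 1;
#endif
}
```

Because the all-ones value is derived from the type itself rather than from a ULL() literal, one definition covers every width from 8 to 128 bits, and the outer cast gives the result the requested type.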


Yours sincerely,
Vincent Mailhol

Patch

diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index c1cb53cf2f0f8662ed3e324578f74330e63f935d..9be2d50da09a417966b3d11c84092bb2f4cd0bef 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -8,7 +8,6 @@ 
 
 #include <uapi/linux/kernel.h>
 
-#define BITS_PER_TYPE(type)	(sizeof(type) * BITS_PER_BYTE)
 #define BITS_TO_LONGS(nr)	__KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(long))
 #define BITS_TO_U64(nr)		__KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u64))
 #define BITS_TO_U32(nr)		__KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u32))
diff --git a/include/linux/bits.h b/include/linux/bits.h
index 5f68980a1b98d771426872c74d7b5c0f79e5e802..f202e46d2f4b7899c16d975120f3fa3ae41556ae 100644
--- a/include/linux/bits.h
+++ b/include/linux/bits.h
@@ -12,6 +12,7 @@ 
 #define BIT_ULL_MASK(nr)	(ULL(1) << ((nr) % BITS_PER_LONG_LONG))
 #define BIT_ULL_WORD(nr)	((nr) / BITS_PER_LONG_LONG)
 #define BITS_PER_BYTE		8
+#define BITS_PER_TYPE(type)	(sizeof(type) * BITS_PER_BYTE)
 
 /*
  * Create a contiguous bitmask starting at bit position @l and ending at
@@ -25,14 +26,38 @@ 
 
 #define GENMASK_INPUT_CHECK(h, l) BUILD_BUG_ON_ZERO(const_true((l) > (h)))
 
-#define GENMASK(h, l) \
-	(GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
-#define GENMASK_ULL(h, l) \
-	(GENMASK_INPUT_CHECK(h, l) + __GENMASK_ULL(h, l))
+/*
+ * Generate a mask for the specified type @t. Additional checks are made to
+ * guarantee the value returned fits in that type, relying on
+ * shift-count-overflow compiler check to detect incompatible arguments.
+ * For example, all these create build errors or warnings:
+ *
+ * - GENMASK(15, 20): wrong argument order
+ * - GENMASK(72, 15): doesn't fit unsigned long
+ * - GENMASK_U32(33, 15): doesn't fit in a u32
+ */
+#define GENMASK_t(t, h, l)				\
+	(GENMASK_INPUT_CHECK(h, l) +			\
+	 (((t)~ULL(0) - ((t)1 << (l)) + 1) &		\
+	  ((t)~ULL(0) >> (BITS_PER_TYPE(t) - 1 - (h)))))
+
+#define GENMASK(h, l) GENMASK_t(unsigned long,  h, l)
+#define GENMASK_ULL(h, l) GENMASK_t(unsigned long long, h, l)
 
 /*
  * Missing asm support
  *
+ * __GENMASK_U*() depends on BITS_PER_TYPE() which would not work in the asm
+ * code as BITS_PER_TYPE() relies on sizeof(), something not available in
+ * asm. Nevertheless, the concept of fixed width integers is a C thing which
+ * does not apply to assembly code.
+ */
+#define GENMASK_U8(h, l) ((unsigned int)GENMASK_t(u8,  h, l))
+#define GENMASK_U16(h, l) ((unsigned int)GENMASK_t(u16, h, l))
+#define GENMASK_U32(h, l) GENMASK_t(u32, h, l)
+#define GENMASK_U64(h, l) GENMASK_t(u64, h, l)
+
+/*
  * __GENMASK_U128() depends on _BIT128() which would not work
  * in the asm code, as it shifts an 'unsigned __int128' data
  * type instead of direct representation of 128 bit constants