diff mbox series

[v2,1/8] minmax: Put all the clamp() definitions together

Message ID 5cd3e11780df40b0b771da5548966ebd@AcuMS.aculab.com (mailing list archive)
State New
Headers show
Series minmax: reduce compilation time | expand

Commit Message

David Laight July 28, 2024, 2:17 p.m. UTC
The defines for clamp() have got separated, move togther for readability.
Update description of signedness check.

Signed-off-by: David Laight <david.laight@aculab.com>
---
v2:
- No change.

 include/linux/minmax.h | 120 +++++++++++++++++++----------------------
 1 file changed, 56 insertions(+), 64 deletions(-)

Comments

Linus Torvalds July 28, 2024, 5:24 p.m. UTC | #1
On Sun, 28 Jul 2024 at 07:18, David Laight <David.Laight@aculab.com> wrote:
>
> +#define min_t(type, x, y)      __careful_cmp(min, (type)(x), (type)(y))
> +#define max_t(type, x, y)      __careful_cmp(max, (type)(x), (type)(y))

This is unrelated to your patch, but since it moves things around and
touches these, I reacted to it..

We should *not* use __careful_cmp() here.

Why? Because part of __careful_cmp() is the "only use arguments once".

But *another* part of __careful_cmp() is "be careful about the types"
in __cmp_once().

And being careful about the types is what causes horrendous expansion,
and is pointless when we just forced things to be the same type.

So we should split __careful_cmp() into one that does just the "do
once" and one that then also does the type checking.

But I think even if we don't do that, I wonder if we can just do this:

  #define __cmp_once(op, x, y, unique_x, unique_y) ({     \
          typeof(x) unique_x = (x);                       \
          typeof(y) unique_y = (y);                       \
          static_assert(__types_ok(x, y),                 \
          ...

and change it to

  #define __cmp_once(op, x, y, unique_x, unique_y) ({     \
          __auto_type unique_x = (x);                     \
          __auto_type unique_y = (y);                     \
          static_assert(__types_ok(unique_x, unique_y),   \
          ...

because while that may screw up the "constant integer" case (because
it now goes through that "unique_XY" variable, maybe it doesn't? At
least gcc has been known to deal with things like arguments to inline
functions well enough (ie a constant argument means that the arguments
shows as __builtin_constant_p(), and we already depend on that).

That single change would cut down on duplication of 'x' and 'y'
_enormously_. No?

(You already did the __auto_type part elsewhere)

Note that this would require the more relaxed "__is_noneg_int()" that
I suggested that allows for any expression, not just C constant
expressions)

           Linus
David Laight July 28, 2024, 6:11 p.m. UTC | #2
From: Linus Torvalds
> Sent: 28 July 2024 18:25
> 
> On Sun, 28 Jul 2024 at 07:18, David Laight <David.Laight@aculab.com> wrote:
> >
> > +#define min_t(type, x, y)      __careful_cmp(min, (type)(x), (type)(y))
> > +#define max_t(type, x, y)      __careful_cmp(max, (type)(x), (type)(y))
> 
> This is unrelated to your patch, but since it moves things around and
> touches these, I reacted to it..
> 
> We should *not* use __careful_cmp() here.
> 
> Why? Because part of __careful_cmp() is the "only use arguments once".
> 
> But *another* part of __careful_cmp() is "be careful about the types"
> in __cmp_once().
> 
> And being careful about the types is what causes horrendous expansion,
> and is pointless when we just forced things to be the same type.
> 
> So we should split __careful_cmp() into one that does just the "do
> once" and one that then also does the type checking.
...

Yes I've seen that and left well alone :-)
Or rather, left it until after MIN() and MAX() are used for constants.

Although min_t(type,x,y) should just be
	type __x = x;
	type __y = y;
	__x < __y ? __x : __y;
Absolutely no point doing anything else.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
David Laight July 28, 2024, 6:23 p.m. UTC | #3
From: Linus Torvalds
> Sent: 28 July 2024 18:25
...
> But I think even if we don't do that, I wonder if we can just do this:
> 
>   #define __cmp_once(op, x, y, unique_x, unique_y) ({     \
>           typeof(x) unique_x = (x);                       \
>           typeof(y) unique_y = (y);                       \
>           static_assert(__types_ok(x, y),                 \
>           ...
> 
> and change it to
> 
>   #define __cmp_once(op, x, y, unique_x, unique_y) ({     \
>           __auto_type unique_x = (x);                     \
>           __auto_type unique_y = (y);                     \
>           static_assert(__types_ok(unique_x, unique_y),   \
>           ...
> 
> because while that may screw up the "constant integer" case (because
> it now goes through that "unique_XY" variable, maybe it doesn't? At
> least gcc has been known to deal with things like arguments to inline
> functions well enough (ie a constant argument means that the arguments
> shows as __builtin_constant_p(), and we already depend on that).
> 
> That single change would cut down on duplication of 'x' and 'y'
> _enormously_. No?

IIRC the unique_x values can be tested with __builtin_constantp()
but will never be 'constant integer expressions' so can't be used
with static_assert() (etc).

I have thought about using typeof(unique_x) but the value 'x'.
That would be messy but only have one expansion of 'x'.
Might be doable if __COUNTER__ is passed as I did for min3().

I think it would be better to build on these changes - since they help.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Linus Torvalds July 28, 2024, 7:55 p.m. UTC | #4
On Sun, 28 Jul 2024 at 11:12, David Laight <David.Laight@aculab.com> wrote:
>
> Although min_t(type,x,y) should just be
>         type __x = x;
>         type __y = y;
>         __x < __y ? __x : __y;
> Absolutely no point doing anything else.

I tried it. Doesn't quite work:

  net/ipv4/proc.c: In function ‘snmp_seq_show_tcp_udp’:
  net/ipv4/proc.c:414:9: error: ISO C90 forbids variable length array
‘buff’ [-Werror=vla]
    414 |         unsigned long buff[TCPUDP_MIB_MAX];
        |         ^~~~~~~~

(and same issue repeated twice for IPv6).

Similar case here:

  drivers/gpu/drm/drm_color_mgmt.c: In function
‘drm_plane_create_color_properties’:
  drivers/gpu/drm/drm_color_mgmt.c:535:16: error: ISO C90 forbids
variable length array ‘enum_list’ [-Werror=vla]
    535 |         struct drm_prop_enum_list enum_list[max_t(int,
DRM_COLOR_ENCODING_MAX,
        |                ^~~~~~~~~~~~~~~~~~

and

  drivers/net/ethernet/stmicro/stmmac/stmmac_main.c: In function
‘stmmac_dma_interrupt’:
  drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:2915:9: error: ISO
C90 forbids variable length array ‘status’ [-Werror=vla]
   2915 |         int status[max_t(u32, MTL_MAX_TX_QUEUES, MTL_MAX_RX_QUEUES)];
        |         ^~~

and several cases in drivers/md/dm-integrity.c.

I guess all of these could just be made to use MIN_T()/MAX_T instead.
We're not talking hundreds of cases, it seems to be just a small
handful.

Let me go check.

               Linus
David Laight July 28, 2024, 8:09 p.m. UTC | #5
From: Linus Torvalds <torvalds@linuxfoundation.org>
> Sent: 28 July 2024 20:55
> 
> On Sun, 28 Jul 2024 at 11:12, David Laight <David.Laight@aculab.com> wrote:
> >
> > Although min_t(type,x,y) should just be
> >         type __x = x;
> >         type __y = y;
> >         __x < __y ? __x : __y;
> > Absolutely no point doing anything else.
> 
> I tried it. Doesn't quite work:
> 
>   net/ipv4/proc.c: In function ‘snmp_seq_show_tcp_udp’:
>   net/ipv4/proc.c:414:9: error: ISO C90 forbids variable length array
> ‘buff’ [-Werror=vla]
>     414 |         unsigned long buff[TCPUDP_MIB_MAX];
>         |         ^~~~~~~~
...

Ah, the constants.
I think they just need to be MIN_CONST() (without the casts).
One might need a cast of one of its constants.
But maybe the signedness test can be ignored if there is a
test for it being a constant.

But (as you said earlier in the year) that should just be MIN().
Except there are a few places that is used that need changing first.

	David

> 
> I guess all of these could just be made to use MIN_T()/MAX_T instead.
> We're not talking hundreds of cases, it seems to be just a small
> handful.
> 
> Let me go check.
> 
>                Linus

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Linus Torvalds July 28, 2024, 8:13 p.m. UTC | #6
On Sun, 28 Jul 2024 at 13:10, David Laight <David.Laight@aculab.com> wrote:
>
> I think they just need to be MIN_CONST() (without the casts).

I'll just convert the existing cases of min_t/max_t to MIN_T/MAX_T,
which I already added for other reasons anyway.

That makes min_t/max_t not have to care about the nasty special cases
(really just array sizes in these cases, and they all wanted MAX_T).

> But (as you said earlier in the year) that should just be MIN().
> Except there are a few places that is used that need changing first.

Yeah, we should just do this for MIN/MAX too, but that's slightly more
annoying because those names are in use and defined by several
drivers.

Which makes it more of a clean-up.

But I'm inclined to just do that too. Bite the bullet, get rid of the
whole "has to be a C constant expression" pain.

          Linus
David Laight July 28, 2024, 8:22 p.m. UTC | #7
..
> But I'm inclined to just do that too. Bite the bullet, get rid of the
> whole "has to be a C constant expression" pain.

At least you can 'just do it' :-)

IIRC some of the MIN() need to be min() and others are used for
constants so would need an initial #ifndef MIN test to maintain
bisection without having a single patch that changes all the
files at once.

MIN() (and probably your MIN_T()) ought to have a check for
being a constant in order to stop misuse.
Perhaps just:
#define MIN(x,y) \
	(__builtin_choose_expr((x)+(y), 0, 0) + ((x) < (y} ? (x) : (y)))

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Linus Torvalds July 28, 2024, 8:31 p.m. UTC | #8
On Sun, 28 Jul 2024 at 13:23, David Laight <David.Laight@aculab.com> wrote:
>
> MIN() (and probably your MIN_T()) ought to have a check for
> being a constant in order to stop misuse.

No, we have a number of "runtime constants" that are basically
"constants" set up at boot-time for the architecture,as pointed out by
the powerpc people in private:

Ie, we have arch/powerpc/include/asm/page.h:

   #define HPAGE_SHIFT hpage_shift

and then

  #define HUGETLB_PAGE_ORDER      (HPAGE_SHIFT - PAGE_SHIFT)

and then

   #define pageblock_order         MIN_T(unsigned int,
HUGETLB_PAGE_ORDER, MAX_PAGE_ORDER)

and we really *REALLY* don't want to force the complicated "min_t()"
(or, worse yet, "min()") functions here just because there's actually
a variable involved.

That variable gets initialized early in
hugetlbpage_init_defaultsize(), so it's *effectively* a constant, but
not as far as the compiler is concerned.

           Linus
Linus Torvalds July 28, 2024, 9:01 p.m. UTC | #9
On Sun, 28 Jul 2024 at 13:23, David Laight <David.Laight@aculab.com> wrote:
>
> At least you can 'just do it' :-)

I decided to use my powers for good. Or at least goodish.

I went through a lot of 'min_t()' and 'max_t()' users, and I think I
found them all. There's a possibility that some driver that I can't
easily build-test has issues, but I tried to manually check all the
architecture ones, and did an allmodconfig build on arm64 and x86-64.

And by visual inspection I found a 32-bit x86 PAE case. Which makes me
think my visual inspection was not entirely broken.

Anyway, I don't love the timing, since I'm going to cut 6.11-rc1 asap,
but I also don't want to unnecessarily leave this pending for later,
so I just committed the simplifications for min_t/max_t.

Doing the same for min/max (no more C constant expression worries!)
would be very good, but I think that's going to be for later.

                Linus
David Laight July 28, 2024, 9:53 p.m. UTC | #10
From: Linus Torvalds
> Sent: 28 July 2024 22:01
> 
> On Sun, 28 Jul 2024 at 13:23, David Laight <David.Laight@aculab.com> wrote:
> >
> > At least you can 'just do it' :-)
> 
> I decided to use my powers for good. Or at least goodish.
> 
> I went through a lot of 'min_t()' and 'max_t()' users, and I think I
> found them all. There's a possibility that some driver that I can't
> easily build-test has issues, but I tried to manually check all the
> architecture ones, and did an allmodconfig build on arm64 and x86-64.
> 
> And by visual inspection I found a 32-bit x86 PAE case. Which makes me
> think my visual inspection was not entirely broken.

I've been through a lot of them in the past.
About 95% could now just be min/max.
The 'fun' ones are where the cast reduces the size of one type.
One of those got fixed because it was a real bug.
But there are some strange (u8) ones in some terminal window size
code which manage to convert 0 to -1 then ~0u and finally to the
max size - which might even be desirable!

> Anyway, I don't love the timing, since I'm going to cut 6.11-rc1 asap,
> but I also don't want to unnecessarily leave this pending for later,
> so I just committed the simplifications for min_t/max_t.

Better than just before -rc6!
At least these changes tend to generate build errors.

	David

> Doing the same for min/max (no more C constant expression worries!)
> would be very good, but I think that's going to be for later.
> 
>                 Linus

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
David Laight July 28, 2024, 10:13 p.m. UTC | #11
From: Linus Torvalds <torvalds@linuxfoundation.org>
> Sent: 28 July 2024 21:32
> 
> On Sun, 28 Jul 2024 at 13:23, David Laight <David.Laight@aculab.com> wrote:
> >
> > MIN() (and probably your MIN_T()) ought to have a check for
> > being a constant in order to stop misuse.
> 
> No, we have a number of "runtime constants" that are basically
> "constants" set up at boot-time for the architecture,as pointed out by
> the powerpc people in private:
> 
> Ie, we have arch/powerpc/include/asm/page.h:
> 
>    #define HPAGE_SHIFT hpage_shift
> 
> and then
> 
>   #define HUGETLB_PAGE_ORDER      (HPAGE_SHIFT - PAGE_SHIFT)
> 
> and then
> 
>    #define pageblock_order         MIN_T(unsigned int,
> HUGETLB_PAGE_ORDER, MAX_PAGE_ORDER)
> 
> and we really *REALLY* don't want to force the complicated "min_t()"
> (or, worse yet, "min()") functions here just because there's actually
> a variable involved.
> 
> That variable gets initialized early in
> hugetlbpage_init_defaultsize(), so it's *effectively* a constant, but
> not as far as the compiler is concerned.

Ok, but those can't be used as array sizes or constants.
So the temporaries don't matter.
Don't they just work with min() - if not where is the signednes mismatch?

Perhaps we want:
	min() - temporaries, signedness check.
	min_t() - temporaries of given type, maybe check size not reduced?
	MIN() - no temporaries, no signedness check, only valid for constants.
	_min() - temporaries, no signedness check.
	_MIN() - no temporaries, no signedness check, variables allowed.

I'm not sure where your MIN_T() fits in the above.

Personally I think min_t() was a mistake.
Only one input can need a cast and an explicit cast would be safer.

	David

> 
>            Linus

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Linus Torvalds July 28, 2024, 10:22 p.m. UTC | #12
On Sun, 28 Jul 2024 at 15:14, David Laight <David.Laight@aculab.com> wrote:
>
> Ok, but those can't be used as array sizes or constants.
> So the temporaries don't matter.

No, mut I don't want the insane size explosion from unnecessarily just
forcing it to use min()/max().

> Don't they just work with min() - if not where is the signednes mismatch?

David - this whole discussion is BECAUSE THESE THINGS ARE A TOTAL
DISASTER WHEN USED IN DEEP MACRO EXPANSION.

So no. It does not work - because core macros like HUGETLB_PAGE_ORDER
end up being used deep in the VM layer, and I don't want to see
another stupid multi-ten-kB line just because min() is such a pig.

End result: I'm going to make the rule be that when you do macro
definitions using constants, then "MIN()/MAX()" is preferable simply
because it avoids the insane expansion noise.

Then in normal *code* you should use min() and max(). But not for
things like macro "constants" even if those constants end up being
some computed thing.

          Linus
Linus Torvalds July 29, 2024, 4:15 a.m. UTC | #13
On Sun, 28 Jul 2024 at 14:01, Linus Torvalds
<torvalds@linuxfoundation.org> wrote:
>
> Doing the same for min/max (no more C constant expression worries!)
> would be very good, but I think that's going to be for later.

Bah. It's like picking at a scab. Later is now. It's out.

I hate how that is_signed_type() isn't a C constant expression for
pointer types. The simplifications get rid of a lot of crap, but sadly
the pointer-induced stuff is still problematic.

Oh well. For that one file, even the partial simplification ended up
shrinking the expansion from 23MB to 1.4MB. It's admittedly a pretty
abnormal case, but still..

           Linus
David Laight July 29, 2024, 8:01 a.m. UTC | #14
From: Linus Torvalds
> Sent: 28 July 2024 23:23
> 
> On Sun, 28 Jul 2024 at 15:14, David Laight <David.Laight@aculab.com> wrote:
> >
> > Ok, but those can't be used as array sizes or constants.
> > So the temporaries don't matter.
> 
> No, mut I don't want the insane size explosion from unnecessarily just
> forcing it to use min()/max().
> 
> > Don't they just work with min() - if not where is the signednes mismatch?
> 
> David - this whole discussion is BECAUSE THESE THINGS ARE A TOTAL
> DISASTER WHEN USED IN DEEP MACRO EXPANSION.
> 
> So no. It does not work - because core macros like HUGETLB_PAGE_ORDER
> end up being used deep in the VM layer, and I don't want to see
> another stupid multi-ten-kB line just because min() is such a pig.
> 
> End result: I'm going to make the rule be that when you do macro
> definitions using constants, then "MIN()/MAX()" is preferable simply
> because it avoids the insane expansion noise.

I think you still need the temporaries if values aren't constant.
And you really don't want the casts unless you actually need them
to do something 'useful' - unlikely especially since negative
values are unusual.

Now you may want to avoid the explosive nature of min(), but if MIN()
(or MIN_T) evaluates its arguments twice someone will use it in the
wrong place.

	David

> 
> Then in normal *code* you should use min() and max(). But not for
> things like macro "constants" even if those constants end up being
> some computed thing.
> 
>           Linus

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Arnd Bergmann July 29, 2024, 10:25 p.m. UTC | #15
On Sun, Jul 28, 2024, at 22:13, Linus Torvalds wrote:
> On Sun, 28 Jul 2024 at 13:10, David Laight <David.Laight@aculab.com> wrote:
>>
>> I think they just need to be MIN_CONST() (without the casts).
>
> I'll just convert the existing cases of min_t/max_t to MIN_T/MAX_T,
> which I already added for other reasons anyway.
>
> That makes min_t/max_t not have to care about the nasty special cases
> (really just array sizes in these cases, and they all wanted MAX_T).

I had prototyped something similar end of last week but didn't manage
to get my version out to you before the weekend. Comparing mine with
what you ended up committing:

- You found exactly the same array index uses I found in
  randconfig testing, so I'm not aware of anything missing
  there.

- My macros use __builtin_choose_expr() instead of ?: to
  ensure that the arguments are constant, this produces a
  relatively clear compiler warning when they are not.
  Without that, I would expect random drivers to start
  using MIN()/MAX() in places where it's not safe.

- I went with the belts-and-suspenders version of MIN()/MAX()
  that works when comparing a negative constant against
  an unsigned one. This requires expanding each argument
  four or five times instead of two, so you might still
  want the simpler version (like MIN_T/MAX_T):

--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -295,12 +271,18 @@ static inline bool in_range32(u32 val, u32 start, u32 len)
        do { typeof(a) __tmp = (a); (a) = (b); (b) = __tmp; } while (0)
 
 /*
- * Use these carefully: no type checking, and uses the arguments
- * multiple times. Use for obvious constants only.
+ * These only work on constant values but return a constant value that
+ * can be used as an array size
  */
-#define MIN(a,b) __cmp(min,a,b)
-#define MAX(a,b) __cmp(max,a,b)
-#define MIN_T(type,a,b) __cmp(min,(type)(a),(type)(b))
-#define MAX_T(type,a,b) __cmp(max,(type)(a),(type)(b))
+#define MIN(x, y) \
+   __builtin_choose_expr(((x) < 0 && (y) > 0), (x), \
+   __builtin_choose_expr((((y) < 0 && (x) > 0) || (y) < (x)), (y), (x)))
+
+#define MAX(x, y) \
+   __builtin_choose_expr(((x) > 0 && (y) < 0), (x), \
+   __builtin_choose_expr((((y) > 0 && (x) < 0) || (y) > (x)), (y), (x)))
+
+#define MIN_T(type,a,b) (type)__builtin_choose_expr((type)(a) < (type)(b), (a), (b))
+#define MAX_T(type,a,b) (type)__builtin_choose_expr((type)(a) > (type)(b), (a), (b))
 
 #endif /* _LINUX_MINMAX_H */

- The change above requires changing a number of files that were
  previously using their own MIN()/MAX() macros over to using
  min()/max(), as they are passing non-constant values in:

 drivers/gpu/drm/amd/display/dc/core/dc_stream.c             | 12 ++++--------
 .../drm/amd/display/dc/dio/dcn20/dcn20_link_encoder.c       |  9 +--------
 .../drm/amd/display/dc/dio/dcn31/dcn31_dio_link_encoder.c   |  8 ++------
 .../drm/amd/display/dc/dio/dcn32/dcn32_dio_link_encoder.c   |  6 +-----
 .../drm/amd/display/dc/dio/dcn321/dcn321_dio_link_encoder.c |  4 ----
 .../drm/amd/display/dc/dio/dcn401/dcn401_dio_link_encoder.c |  8 --------
 .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.c                | 13 +++----------
 .../drm/amd/display/dc/dsc/dc_dsc.c                         |  9 +--------
 .../drm/amd/display/dc/link/protocols/link_dp_capability.c  | 13 +++----------
 drivers/gpu/drm/amd/display/modules/hdcp/hdcp_ddc.c         | 11 ++++-------
 drivers/gpu/drm/radeon/evergreen_cs.c                       |  9 ++-------
 drivers/platform/x86/sony-laptop.c                          |  4 ++--
 kernel/trace/preemptirq_delay_test.c                        |  2 +-
 lib/decompress_unlzma.c                                     |  7 ++-----
 14 files changed, 26 insertions(+), 89 deletions(-)

  Changing these is probably a good idea regardless.

- I also tried simplifying __types_ok()  further, which as
  you already  mentioned doesn't easily work with pointer
  arguments. Again we could work around this with a separate
  min_ptr()/max_ptr() helper. I only found 11 files that
  actually compare pointers (on x86/arm/arm64 randconfig):  

 arch/arm64/kvm/hyp/nvhe/page_alloc.c  | 2 +-
 crypto/skcipher.c                     | 2 +-
 drivers/gpu/drm/drm_modes.c           | 2 +-
 drivers/infiniband/hw/hfi1/pio_copy.c | 4 ++--
 drivers/irqchip/irq-bcm7120-l2.c      | 2 +-
 drivers/spi/spi-cs42l43.c             | 8 ++++----
 fs/ntfs3/lznt.c                       | 2 +-
 lib/lzo/lzo1x_compress.c              | 2 +-
 mm/kmemleak.c                         | 4 ++--
 mm/percpu.c                           | 2 +-
 net/ceph/osdmap.c                     | 6 +++---
 11 files changed, 25 insertions(+), 18 deletions(-)

 The simpler __types_ok() needs more testing across all
 compiler versions, so that wouldn't be for 6.11 anyway.
 I can send the min_ptr()/max_ptr() stuff anyway if
 you think that's a good idea.  

       Arnd
Linus Torvalds July 29, 2024, 11:21 p.m. UTC | #16
On Mon, 29 Jul 2024 at 15:25, Arnd Bergmann <arnd@kernel.org> wrote:
>
> - My macros use __builtin_choose_expr() instead of ?: to
>   ensure that the arguments are constant, this produces a
>   relatively clear compiler warning when they are not.
>   Without that, I would expect random drivers to start
>   using MIN()/MAX() in places where it's not safe.

Hmm. We have known non-constant uses, which Stephen Rothwell pointed
out elsewhere, with PowerPC having things like this:

  #define PAGE_SHIFT              CONFIG_PAGE_SHIFT
  extern unsigned int hpage_shift;
  #define HPAGE_SHIFT hpage_shift
  #define HUGETLB_PAGE_ORDER      (HPAGE_SHIFT - PAGE_SHIFT)

and honestly, considering the absolutely *horrid* mess that is the
current min/max expansion, I really think that using MIN/MAX for these
kinds of expressions is the right thing to do.

I don't worry about architecture code that has a boot-time
pseudo-constant for some inherent property.  I worry about random
drivers doing crazy things.

But it is *not* a constant, and any MIN/MAX that disallows it is imho
not a good MIN/MAX.

What we actually care about is not "constant", but "no side effects".

And obviously the typing, but that ends up being really hard.

We could possibly require that the typing be extra strict for MIN/MAX,
using the old __typecheck(x, y) macro we used to use. That literally
requires the *same* types, but that then ends up being pointlessly
painful for the "actual constant" cases, because dammit, a regular
non-negative integer constant should compare with any type.

It's a real pain that compiler writers can't just get the simple cases
right and give us a sane "__builtin_ordered_ok()" kind of thing.
Instead they push things like the absolute unusable garbage that is
-Wsign-compare.

Anyway, I also completely gave up on another absolute standard garbage
thing: _static_assert(). What a horrendous piece of sh*t that is.
Replacing our use of that with our own BUILD_BUG_ON_MSG() made a lot
of things much much simpler.

Attached is the patch I have in my tree right now - it complains about
a 'bcachefs' comparison between an 'u16' and a 's64', because I also
removed the 'implicit integer promotion is ok' logic, because I think
it's wrong.

I don't think a min(u16,s64) is a valid minimum, for exactly the same
reason a min(u32,s64) is not valid.

Despite the C integer expression promotion rules.

             Linus
Linus Torvalds July 30, 2024, 1:52 a.m. UTC | #17
On Mon, 29 Jul 2024 at 16:21, Linus Torvalds
<torvalds@linuxfoundation.org> wrote:
>
> What we actually care about is not "constant", but "no side effects".

Ho humm..

We actually could do something like this:

   #define no_side_effects(x) __builtin_constant_p((x,0))

because __builtin_constant_p() doesn't actually look at just the value
of the expression, but is defined to return 0 even if the value is
constant but the expression has side effects.

So no_side_effects(variable) returns true, but
no_side_effects(variable++) returns false.

Note that this is also why _static_assert() and
__builtin_choose_expr() are generally very dubiously useful. Because
they are limited to a "C constant expression", they fundamentally
cannot take advantage of trivial optimization things, and fall flat on
their face in many real life situations.

In contrast, __builtin_constant_p() works for arbitrary expressions,
and just says "I can easily turn this expression into a constant".

Which makes it much more powerful than the nasty syntactic "C constant
expression" thing that is very limited.

Things like __builtin_choose_expr() are ok with very simple and direct
conditionals like "is this a double", and they have their place, but
in something like min() that by definition takes many different types
- including pointer types - it's been a huge pain.

              Linus
Linus Torvalds July 30, 2024, 3:59 a.m. UTC | #18
On Mon, 29 Jul 2024 at 16:21, Linus Torvalds
<torvalds@linuxfoundation.org> wrote:
>
> Attached is the patch I have in my tree right now - it complains about
> a 'bcachefs' comparison between an 'u16' and a 's64', because I also
> removed the 'implicit integer promotion is ok' logic, because I think
> it's wrong.
>
> I don't think a min(u16,s64) is a valid minimum, for exactly the same
> reason a min(u32,s64) is not valid.

Oh, and I noticed that it screws up the 32-bit case, and that does
need a workaround for that.

So here's a better version. The patch contains one possible fix to
bcachefs for the type confusion there, but I'll wait for Kent to
respond on that.

             Linus
Arnd Bergmann July 30, 2024, 10:10 a.m. UTC | #19
On Tue, Jul 30, 2024, at 05:59, Linus Torvalds wrote:
> On Mon, 29 Jul 2024 at 16:21, Linus Torvalds <torvalds@linuxfoundation.org> wrote:
>>
>> Attached is the patch I have in my tree right now - it complains about
>> a 'bcachefs' comparison between an 'u16' and a 's64', because I also
>> removed the 'implicit integer promotion is ok' logic, because I think
>> it's wrong.

I'm giving this a spin on the randconfig test setup now to see
if there are some other cases like the bcachefs one. So far I've
seen one failure, but I can't make sense of it yet:

drivers/gpu/drm/i915/display/intel_backlight.c: In function 'scale':
include/linux/compiler_types.h:510:45: error: call to '__compiletime_assert_905' declared with attribute error: clamp() low limit source_min greater than high limit source_max
include/linux/minmax.h:107:9: note: in expansion of macro 'BUILD_BUG_ON_MSG'
  107 |         BUILD_BUG_ON_MSG(statically_true(ulo > uhi),                            \
drivers/gpu/drm/i915/display/intel_backlight.c:47:22: note: in expansion of macro 'clamp'
   47 |         source_val = clamp(source_val, source_min, source_max);

See https://pastebin.com/raw/yLJ5ZqVw for the x86-64 .config
that triggered this.

>> I don't think a min(u16,s64) is a valid minimum, for exactly the same
>> reason a min(u32,s64) is not valid.
>
> Oh, and I noticed that it screws up the 32-bit case, and that does
> need a workaround for that.
>
> So here's a better version. The patch contains one possible fix to
> bcachefs for the type confusion there, but I'll wait for Kent to
> respond on that.

That's still a typo in the 32-bit case, right?
I've changed

 __builtin_choose_expr(sizeof(ux)>32,1LL,1L))

to check for sizeof(ux)>4 for my testing.

    Arnd
David Laight July 30, 2024, 12:03 p.m. UTC | #20
From: Arnd Bergmann
> Sent: 29 July 2024 23:25
..
> - My macros use __builtin_choose_expr() instead of ?: to
>   ensure that the arguments are constant, this produces a
>   relatively clear compiler warning when they are not.
>   Without that, I would expect random drivers to start
>   using MIN()/MAX() in places where it's not safe.

Yes, I think you either want temporaries or a constant check.
Losing the signedness check is less of a problem.
Since 'constantness' is only important for array sizes and
static initialisers the constant check is unlikely to be a problem.

Just adding __builtin_choose_expr((x)+(y),0,0) to the result
is enough and quite simple.
Then each argument is expanded 3 times.
(cf once for temporaries without a signedness check.))

> 
> - I went with the belts-and-suspenders version of MIN()/MAX()
>   that works when comparing a negative constant against
>   an unsigned one. This requires expanding each argument
>   four or five times instead of two, so you might still
>   want the simpler version (like MIN_T/MAX_T):

I'm not sure that is worth it, there are very few negative
values (except error values) in the entire kernel.
And where ever the value is used a carefully created negative
constant is likely to be promoted to unsigned.

I'm not sure I like the casts in MIN_T() and MAX_T() either.
As with min_t() someone will use too small a type.

OTOH umin(x, y) could now be defined as:
	min_t(u64, (x) + 0 + 0u, (y) + 0 + 0u)
(which will never sign extend).
Then some of the bloating min() that are known to generate
unsigned values can be changed to umin().

It is also worth noting that umin() is likely to generate
better code than an _min() (that doesn't have the type check)
because comparing a 32bit signed type (assumed positive) with
a 64bit value won't require the (pointless) sign extension
code (particularly nasty on 32bit).
In my tests the compiler optimised for the high 32bits being
zero (from the zero extend) - so extending to 64bits doesn't matter.

The typeof((x) + 0) in the signedness test (a different email)
that matches the C promotion of u8 and u16 might have been added
before I did the change to allow unsigned comparisons against
positive integer constants.

Another anti-bloat is to do all the type tests on the temporaries.
So something based on:
#define ok_unsigned(_x, x)
	(is_unsigned_type((typeof(_x)) ? 1 : if_constexpr((x), (x) >= 0, 0))) 

#define min(x, y)
	__auto_type _x = x;
	__auto_type _y = y;
	if (!is_signed_type(typeof(_x + _y))
		&& !(ok_unsigned(_x, x) && ok_unsigned(_y, y)))
	   error();
	__x < _y ? _x : _y;
which only has 3 expansions of each term.

(Of course the 'uniq' needed for nested calls don't help.)

	David

	

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Arnd Bergmann July 30, 2024, 2:14 p.m. UTC | #21
On Tue, Jul 30, 2024, at 12:10, Arnd Bergmann wrote:
> On Tue, Jul 30, 2024, at 05:59, Linus Torvalds wrote:
>> On Mon, 29 Jul 2024 at 16:21, Linus Torvalds <torvalds@linuxfoundation.org> wrote:
>
> I'm giving this a spin on the randconfig test setup now to see
> if there are some other cases like the bcachefs one. So far I've
> seen one failure, but I can't make sense of it yet:
>
> drivers/gpu/drm/i915/display/intel_backlight.c: In function 'scale':
> include/linux/compiler_types.h:510:45: error: call to 
> '__compiletime_assert_905' declared with attribute error: clamp() low 
> limit source_min greater than high limit source_max
> include/linux/minmax.h:107:9: note: in expansion of macro 
> 'BUILD_BUG_ON_MSG'
>   107 |         BUILD_BUG_ON_MSG(statically_true(ulo > uhi),            
>                 \
> drivers/gpu/drm/i915/display/intel_backlight.c:47:22: note: in 
> expansion of macro 'clamp'
>    47 |         source_val = clamp(source_val, source_min, source_max);
>
> See https://pastebin.com/raw/yLJ5ZqVw for the x86-64 .config
> that triggered this.

The above seems to happen only with gcc-13 and gcc-14, but not gcc-12
and earlier, and it's the only one I've seen with a bit of randconfig
testing on that version.

There is another one that I see with gcc-8 randconfigs (arm64):

net/netfilter/ipvs/ip_vs_conn.c: In function 'ip_vs_conn_init':
include/linux/compiler_types.h:510:38: error: call to '__compiletime_assert_1040' declared with attribute error: clamp() low limit min greater than high limit max_avail
  510 |  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
      |                                      ^
include/linux/minmax.h:182:28: note: in expansion of macro '__careful_clamp'
  182 | #define clamp(val, lo, hi) __careful_clamp(val, lo, hi)
      |                            ^~~~~~~~~~~~~~~
net/netfilter/ipvs/ip_vs_conn.c:1498:8: note: in expansion of macro 'clamp'
 1498 |  max = clamp(max, min, max_avail);

I can reproduce this one with gcc-8/9/10, but not gcc-11
or higher.

This may be another case of __builtin_constant_p() being
slightly unreliable when a local variable is constant-folded
based on a condition, or with partial inlining.

     Arnd
Linus Torvalds July 30, 2024, 4:35 p.m. UTC | #22
On Tue, 30 Jul 2024 at 03:11, Arnd Bergmann <arnd@kernel.org> wrote:
>
> I'm giving this a spin on the randconfig test setup now to see
> if there are some other cases like the bcachefs one. So far I've
> seen one failure, but I can't make sense of it yet:

So the new checks are actually a lot smarter, since unlike the old
ones they don't require a C constant expression, and will find cases
where the compiler can see expressions that turn out statically
optimizable.

This is a great example of that, although "great" in this case is
sadly not what we want:

> drivers/gpu/drm/i915/display/intel_backlight.c: In function 'scale':
> include/linux/compiler_types.h:510:45: error: call to '__compiletime_assert_905' declared with attribute error: clamp() low limit source_min greater than high limit source_max
> include/linux/minmax.h:107:9: note: in expansion of macro 'BUILD_BUG_ON_MSG'
>   107 |         BUILD_BUG_ON_MSG(statically_true(ulo > uhi),                            \
> drivers/gpu/drm/i915/display/intel_backlight.c:47:22: note: in expansion of macro 'clamp'
>    47 |         source_val = clamp(source_val, source_min, source_max);

So here *locally*, source_min and source_max can't be ordered, but
what I think has happened is that we had that earlier

        WARN_ON(source_min > source_max);

and then gcc sees the "statically_true(ulo > uhi)" test, and will do
CSE on the variables and on the test condition and the conditional,
and basically have turned all of this into

        if (source_min > source_max) {
                WARN(..)
                source_val = clamp(source_val, source_min, source_max);
        } else {
                source_val = clamp(source_val, source_min, source_max);
        }

and now the case with the WARN() will statically obviously be bad.

I don't see the failure, so it clearly depends on some config default,
and I suspect with my allmodconfig build, for example, there is so
much else going on that gcc won't have done the above trivial
conversion.

The old code never saw any of this, because the old code was using the
terminally stupid _static_assert(), and within the much more limited
scope of a "C constant expression", that "source_min < source_max"
could never be true, even if there are code paths where it *is* true.

But here I think we were bitten by excessive cleverness.

> That's still a typo in the 32-bit case, right?
> I've changed
>
>  __builtin_choose_expr(sizeof(ux)>32,1LL,1L))
>
> to check for sizeof(ux)>4 for my testing.

Bah yes. I had that fix locally, and sent the old patch.

            Linus
Linus Torvalds July 30, 2024, 4:46 p.m. UTC | #23
On Tue, 30 Jul 2024 at 09:35, Linus Torvalds
<torvalds@linuxfoundation.org> wrote:
>
> So here *locally*, source_min and source_max can't be ordered, but
> what I think has happened is that we had that earlier
>
>         WARN_ON(source_min > source_max);
>
> and then gcc sees the "statically_true(ulo > uhi)" test, and will do
> CSE on the variables and on the test condition and the conditional,
> and basically have turned all of this into
>
>         if (source_min > source_max) {
>                 WARN(..)
>                 source_val = clamp(source_val, source_min, source_max);
>         } else {
>                 source_val = clamp(source_val, source_min, source_max);
>         }

Confirmed with your .config - removing the WARN_ON() removes the
clamping range error, because then there is no "move code into shared
conditional section" case any more.

That's slightly annoying. The new clamp() logic is not only a much
cleaner macro expansion, it's also *much* smarter and would find real
problems when the limits have been passed as arguments to inline
functions etc.

But obviously this "it's statically wrong in one path when the code
has been duplicated by the compiler" means that it's just too smart
for its own good in this case.

If the WARN_ON() had a "return error", it would all work out
beautifully. But now we literally have code that says "I just tested
for this error condition, and then I went ahead and did the error case
anyway", and that just makes my nice new sanity check unhappy.

Darn.

                Linus
Linus Torvalds July 30, 2024, 6:02 p.m. UTC | #24
On Tue, 30 Jul 2024 at 07:15, Arnd Bergmann <arnd@kernel.org> wrote:
>
> There is another one that I see with gcc-8 randconfigs (arm64):

So I ended up undoing that part of my patch, so it's a non-issue, but
I wanted to figure out what went wrong.

It's actually quite funny.

> net/netfilter/ipvs/ip_vs_conn.c:1498:8: note: in expansion of macro 'clamp'
>  1498 |  max = clamp(max, min, max_avail);

So it turns out that min is seen by the compiler to be constant (8).
And max_avail() uses order_base_2(), which expands to this huge
comparison table for the constant case.

Now, in the end all of those comparisons will go away, but I think
what happens is that because they exist at the source level, the
compiler ends up expanding them, and then - for exactly the same
reason as before - the compiler can find a path in that tree of
conditionals where "max_avail" does indeed end up being a constant
smaller than 8.

The fact that that path is then never taken, and pruned away later,
doesn't help us, and the compiletime_assert happens because in one
intermediate form the compiler had folded the now trivial 'clamp()' of
two constants in with the conditionals that get thrown away, and the
warning happens even when it's not reachable.

Anyway, it does mean that yeah, the much more stupid static_assert()
that will never try to follow any chain of expressions si the way to
go.

For the type checking, this isn't as much of an issue.

The whole "is that a non-negative expression" also uses the same
machinery, but if the expression could be negative in some situation
(that goes away in the end) at worst it just means that the new rules
won't be as relaxed as they could be when the expressions get
sufficiently complicated.

                Linus
Linus Torvalds July 30, 2024, 7:52 p.m. UTC | #25
On Tue, 30 Jul 2024 at 11:02, Linus Torvalds
<torvalds@linuxfoundation.org> wrote:
>
> On Tue, 30 Jul 2024 at 07:15, Arnd Bergmann <arnd@kernel.org> wrote:
> >
> > There is another one that I see with gcc-8 randconfigs (arm64):
>
> So I ended up undoing that part of my patch, so it's a non-issue [..]

I pushed out my current one.

It keeps the old semantics wrt the clamp() static_assert, and it
obviously has the "allow small unsigned types to promote to 'int'"
that I already did earlier.

I still suspect we shouldn't do that relaxed integer promotion rule,
but it's what we used to do, and it's easy to get rid of if we decide
to, and it's a separate issue from the whole "make minmax expansion
more reasonable".

              Linus
David Laight July 30, 2024, 9:47 p.m. UTC | #26
From: Linus Torvalds
> Sent: 30 July 2024 20:53
> To: Arnd Bergmann <arnd@kernel.org>
> Cc: David Laight <David.Laight@ACULAB.COM>; linux-kernel@vger.kernel.org; Jens Axboe
> <axboe@kernel.dk>; Matthew Wilcox <willy@infradead.org>; Christoph Hellwig <hch@infradead.org>; Andrew
> Morton <akpm@linux-foundation.org>; Andy Shevchenko <andriy.shevchenko@linux.intel.com>; Dan Carpenter
> <dan.carpenter@linaro.org>; Jason A . Donenfeld <Jason@zx2c4.com>; pedro.falcato@gmail.com; Mateusz
> Guzik <mjguzik@gmail.com>; linux-mm@kvack.org; Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Subject: Re: [PATCH v2 1/8] minmax: Put all the clamp() definitions together
> 
> On Tue, 30 Jul 2024 at 11:02, Linus Torvalds
> <torvalds@linuxfoundation.org> wrote:
> >
> > On Tue, 30 Jul 2024 at 07:15, Arnd Bergmann <arnd@kernel.org> wrote:
> > >
> > > There is another one that I see with gcc-8 randconfigs (arm64):
> >
> > So I ended up undoing that part of my patch, so it's a non-issue [..]
> 
> I pushed out my current one.

Have you a plan to 'fix' min3() ?

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Linus Torvalds July 30, 2024, 10:44 p.m. UTC | #27
On Tue, 30 Jul 2024 at 14:48, David Laight <David.Laight@aculab.com> wrote:
>
> Have you a plan to 'fix' min3() ?

I didn't have a test-case that showed it being a problem, particularly
after min() no longer expands exponentially, but maybe something like
the attached would work?

All the infrastructure is in place for it, this just uses it.

Does this work for you? Was there some particularly bad case you were
looking at that I missed?

           Linus
Linus Torvalds July 30, 2024, 11:03 p.m. UTC | #28
On Tue, 30 Jul 2024 at 15:44, Linus Torvalds
<torvalds@linuxfoundation.org> wrote:
>
> Does this work for you?

It seems to at least build cleanly here, but I'm not claiming it's all
that great.

The nested __cmp() is still rather less than optimal from an expansion
standpoint, but at least it expands only those unique temporaries.

[ Side note: having not looked at a lot of the resulting pre-processed
mess, I'm not convinced it really helps to make those unique names so
long.

  The whole "__UNIQUE_ID_" prefix looks good once, but to some degree
it actually hides the important part, which is the actual prefix and
the unique number.

  But honestly, nobody ever looks at this part normally, so it
probably doesn't matter ]

It might be possible to cut down on that by doing them in series
instead of nested, but I think that would require something like
generating a fourth unique name, and something along the lines of

    __auto_type u4 = __cmp(op, ux, uy); __cmp(op, u4, uz);

as that last line.

And no, I did *not* try that, and there might be something I'm missing.

        Linus
David Laight July 31, 2024, 8:09 a.m. UTC | #29
From: Linus Torvalds
> Sent: 31 July 2024 00:04
> 
> On Tue, 30 Jul 2024 at 15:44, Linus Torvalds
> <torvalds@linuxfoundation.org> wrote:
> >
> > Does this work for you?
> 
> It seems to at least build cleanly here, but I'm not claiming it's all
> that great.
> 
> The nested __cmp() is still rather less than optimal from an expansion
> standpoint, but at least it expands only those unique temporaries.

That is the main gain, IIRC Arnd did suggest splitting it but that is
a relatively small gain.

> [ Side note: having not looked at a lot of the resulting pre-processed
> mess, I'm not convinced it really helps to make those unique names so
> long.
> 
>   The whole "__UNIQUE_ID_" prefix looks good once, but to some degree
> it actually hides the important part, which is the actual prefix and
> the unique number.

I just passed __COUNTER__ through in my min3() patch to avoid
passing lots of parameters and then appended it to the name
giving _x_12345 (etc).
The __UNIQUE_ID_() define just seemed excessive - especially
since all compiler versions support __COUNTER__.

Just need to remember a relay #define since #define arguments get
expanded when they are substituted not at the 'call' site.
(Which is also true for __UNIQUE_ID())

That also makes it much easier to add an extra unique name.

>   But honestly, nobody ever looks at this part normally, so it
> probably doesn't matter ]

Except that when you do it is all a right PITA.
Not helped by the actual name being rammed on the end.

> 
> It might be possible to cut down on that by doing them in series
> instead of nested, but I think that would require something like
> generating a fourth unique name, and something along the lines of
> 
>     __auto_type u4 = __cmp(op, ux, uy); __cmp(op, u4, uz);
> 
> as that last line.
> 
> And no, I did *not* try that, and there might be something I'm missing.

If you have to pass through a 'u4' name that could easily take longer.

	David

> 
>         Linus

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Arnd Bergmann July 31, 2024, 10:50 a.m. UTC | #30
On Wed, Jul 31, 2024, at 10:09, David Laight wrote:
> From: Linus Torvalds
>> Sent: 31 July 2024 00:04
>> 
>> On Tue, 30 Jul 2024 at 15:44, Linus Torvalds
>> <torvalds@linuxfoundation.org> wrote:
>> >
>> > Does this work for you?
>> 
>> It seems to at least build cleanly here, but I'm not claiming it's all
>> that great.
>> 
>> The nested __cmp() is still rather less than optimal from an expansion
>> standpoint, but at least it expands only those unique temporaries.
>
> That is the main gain, IIRC Arnd did suggest splitting it but that is
> a relatively small gain.

Right, I suggested a split version earlier, but that was without
the __careful_op3, something like 

#define min3(x, y, z) ({ __auto_type __xy = min(x, y); min(__xy, z); })

I think Linus' approach with __careful_op3() is better here because
it handles more corner cases with constant arguments, otherwise
the two approaches are equivalent in how they expand the arguments.

Doing both __careful_op3() and another temporary won't add any
advantages that I can see either.

     Arnd
Linus Torvalds July 31, 2024, 3:38 p.m. UTC | #31
On Wed, 31 Jul 2024 at 01:10, David Laight <David.Laight@aculab.com> wrote:
>
> The __UNIQUE_ID_() define just seemed excessive - especially
> since all compiler versions support __COUNTER__.

Yes, we could probably just simplify it.

The thing is, "all compiler versions support __COUNTER__" wasn't
historically true.

We used to have this:

  /* Not-quite-unique ID. */
  #ifndef __UNIQUE_ID
  # define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __LINE__)
  #endif

because __COUNTER__ is such a new-fangled thing and only got
introduced in gcc-4 or something like that.

So that "prefix" literally exists because it literally wasn't unique
enough without it.

And the "__UNIQUE_ID_" thing is probably because that way it was
clearer what was going on when something went wrong.

But together they really end up being a somewhat unreadable mess.

That said, I did end up liking the "prefix" part when looking at
expansions, because it helps show "which" expansion it is (ie "x_123"
and "y_124" were clearer than just some pure counter value that
doesn't have any relationship to the origin at all in the name).

But I did change it to "x_" from "__x", because that way it was
minimal, yet clearly separate from the counter number (ie "x_123" was
better than "__x123").

It was the repeated useless "__UNIQUE_ID_" part of the expansion that
ended up annoying. Not quite annoying enough to change it to just
"___" or something, but I was close.

           Linus
David Laight July 31, 2024, 3:56 p.m. UTC | #32
From: Linus Torvalds
> Sent: 31 July 2024 16:38
> On Wed, 31 Jul 2024 at 01:10, David Laight <David.Laight@aculab.com> wrote:
> >
> > The __UNIQUE_ID_() define just seemed excessive - especially
> > since all compiler versions support __COUNTER__.
> 
> Yes, we could probably just simplify it.
...
> So that "prefix" literally exists because it literally wasn't unique
> enough without it.

I don't remember that far back :-)

> And the "__UNIQUE_ID_" thing is probably because that way it was
> clearer what was going on when something went wrong.
> 
> But together they really end up being a somewhat unreadable mess.
> 
> That said, I did end up liking the "prefix" part when looking at
> expansions, because it helps show "which" expansion it is (ie "x_123"
> and "y_124" were clearer than just some pure counter value that
> doesn't have any relationship to the origin at all in the name).

I guess that once the caller-supplied prefix was added the fixed
__UNIQUE_ID_ bit could just have been removed.

> But I did change it to "x_" from "__x", because that way it was
> minimal, yet clearly separate from the counter number (ie "x_123" was
> better than "__x123").

Indeed...

I got annoyed because the unique 'x' and 'y' for min() end up
having differ numbers - which can make it harder see what is going
on with nested expansions.
I might even have done a global replace to get rid of the __UNIQUE_ID_
text in an attempt to make the expansions readable.

Perhaps something like:
#define do_foo(x, uniq) ({ \
	__auto_type _x_##uniq = z; \
	..

#define foo_relay(x, counter) do_foo(x, counter)
#define foo(x) foo_relay(x, __COUNTER__)
is the way to organise it.
Since you don't get a unique number until __COUNTER__ is expanded
inside foo_relay().

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Linus Torvalds July 31, 2024, 4:04 p.m. UTC | #33
On Wed, 31 Jul 2024 at 08:57, David Laight <David.Laight@aculab.com> wrote:
>
> Perhaps something like:
> #define do_foo(x, uniq) ({ \
>         __auto_type _x_##uniq = z; \

I  like it.

Not quite enough to do it now, though.

But I did commit my min3/max3 patch, because when I looked at it a bit
more, I realized that the old min3/max3 implementation was actually
completely and fundamentally broken, and has been for a decade since
it was rewritten in 2014.

I don't think anybody has actually hit the bug in practice (it will
cast away significant bits if you use it with just the right types),
but when I looked at it again I was like "Really? We've had that
horror for a decade?"

       Linus
diff mbox series

Patch

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index a7ef65f78933..cea63a8ac80f 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -57,26 +57,6 @@ 
 		__cmp(op, x, y),				\
 		__cmp_once(op, x, y, __UNIQUE_ID(__x), __UNIQUE_ID(__y)))
 
-#define __clamp(val, lo, hi)	\
-	((val) >= (hi) ? (hi) : ((val) <= (lo) ? (lo) : (val)))
-
-#define __clamp_once(val, lo, hi, unique_val, unique_lo, unique_hi) ({		\
-	typeof(val) unique_val = (val);						\
-	typeof(lo) unique_lo = (lo);						\
-	typeof(hi) unique_hi = (hi);						\
-	static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)), 	\
-			(lo) <= (hi), true),					\
-		"clamp() low limit " #lo " greater than high limit " #hi);	\
-	static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");	\
-	static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");	\
-	__clamp(unique_val, unique_lo, unique_hi); })
-
-#define __careful_clamp(val, lo, hi) ({					\
-	__builtin_choose_expr(__is_constexpr((val) - (lo) + (hi)),	\
-		__clamp(val, lo, hi),					\
-		__clamp_once(val, lo, hi, __UNIQUE_ID(__val),		\
-			     __UNIQUE_ID(__lo), __UNIQUE_ID(__hi))); })
-
 /**
  * min - return minimum of two values of the same or compatible types
  * @x: first value
@@ -124,6 +104,22 @@ 
  */
 #define max3(x, y, z) max((typeof(x))max(x, y), z)
 
+/**
+ * min_t - return minimum of two values, using the specified type
+ * @type: data type to use
+ * @x: first value
+ * @y: second value
+ */
+#define min_t(type, x, y)	__careful_cmp(min, (type)(x), (type)(y))
+
+/**
+ * max_t - return maximum of two values, using the specified type
+ * @type: data type to use
+ * @x: first value
+ * @y: second value
+ */
+#define max_t(type, x, y)	__careful_cmp(max, (type)(x), (type)(y))
+
 /**
  * min_not_zero - return the minimum that is _not_ zero, unless both are zero
  * @x: value1
@@ -134,39 +130,60 @@ 
 	typeof(y) __y = (y);			\
 	__x == 0 ? __y : ((__y == 0) ? __x : min(__x, __y)); })
 
+#define __clamp(val, lo, hi)	\
+	((val) >= (hi) ? (hi) : ((val) <= (lo) ? (lo) : (val)))
+
+#define __clamp_once(val, lo, hi, unique_val, unique_lo, unique_hi) ({		\
+	typeof(val) unique_val = (val);						\
+	typeof(lo) unique_lo = (lo);						\
+	typeof(hi) unique_hi = (hi);						\
+	static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)),	\
+			(lo) <= (hi), true),					\
+		"clamp() low limit " #lo " greater than high limit " #hi);	\
+	static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");	\
+	static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");	\
+	__clamp(unique_val, unique_lo, unique_hi); })
+
+#define __careful_clamp(val, lo, hi) ({					\
+	__builtin_choose_expr(__is_constexpr((val) - (lo) + (hi)),	\
+		__clamp(val, lo, hi),					\
+		__clamp_once(val, lo, hi, __UNIQUE_ID(__val),		\
+			     __UNIQUE_ID(__lo), __UNIQUE_ID(__hi))); })
+
 /**
  * clamp - return a value clamped to a given range with strict typechecking
  * @val: current value
  * @lo: lowest allowable value
  * @hi: highest allowable value
  *
- * This macro does strict typechecking of @lo/@hi to make sure they are of the
- * same type as @val.  See the unnecessary pointer comparisons.
+ * This macro checks that @val, @lo and @hi have the same signedness.
  */
 #define clamp(val, lo, hi) __careful_clamp(val, lo, hi)
 
-/*
- * ..and if you can't take the strict
- * types, you can specify one yourself.
- *
- * Or not use min/max/clamp at all, of course.
- */
-
 /**
- * min_t - return minimum of two values, using the specified type
- * @type: data type to use
- * @x: first value
- * @y: second value
+ * clamp_t - return a value clamped to a given range using a given type
+ * @type: the type of variable to use
+ * @val: current value
+ * @lo: minimum allowable value
+ * @hi: maximum allowable value
+ *
+ * This macro does no typechecking and uses temporary variables of type
+ * @type to make all the comparisons.
  */
-#define min_t(type, x, y)	__careful_cmp(min, (type)(x), (type)(y))
+#define clamp_t(type, val, lo, hi) __careful_clamp((type)(val), (type)(lo), (type)(hi))
 
 /**
- * max_t - return maximum of two values, using the specified type
- * @type: data type to use
- * @x: first value
- * @y: second value
+ * clamp_val - return a value clamped to a given range using val's type
+ * @val: current value
+ * @lo: minimum allowable value
+ * @hi: maximum allowable value
+ *
+ * This macro does no typechecking and uses temporary variables of whatever
+ * type the input argument @val is.  This is useful when @val is an unsigned
+ * type and @lo and @hi are literals that will otherwise be assigned a signed
+ * integer type.
  */
-#define max_t(type, x, y)	__careful_cmp(max, (type)(x), (type)(y))
+#define clamp_val(val, lo, hi) clamp_t(typeof(val), val, lo, hi)
 
 /*
  * Do not check the array parameter using __must_be_array().
@@ -211,31 +228,6 @@ 
  */
 #define max_array(array, len) __minmax_array(max, array, len)
 
-/**
- * clamp_t - return a value clamped to a given range using a given type
- * @type: the type of variable to use
- * @val: current value
- * @lo: minimum allowable value
- * @hi: maximum allowable value
- *
- * This macro does no typechecking and uses temporary variables of type
- * @type to make all the comparisons.
- */
-#define clamp_t(type, val, lo, hi) __careful_clamp((type)(val), (type)(lo), (type)(hi))
-
-/**
- * clamp_val - return a value clamped to a given range using val's type
- * @val: current value
- * @lo: minimum allowable value
- * @hi: maximum allowable value
- *
- * This macro does no typechecking and uses temporary variables of whatever
- * type the input argument @val is.  This is useful when @val is an unsigned
- * type and @lo and @hi are literals that will otherwise be assigned a signed
- * integer type.
- */
-#define clamp_val(val, lo, hi) clamp_t(typeof(val), val, lo, hi)
-
 static inline bool in_range64(u64 val, u64 start, u64 len)
 {
 	return (val - start) < len;