diff mbox series

[5/6] x86/cpu-policy: Disentangle X86_NR_FEAT and FEATURESET_NR_ENTRIES

Message ID 20230504193924.3305496-6-andrew.cooper3@citrix.com (mailing list archive)
State New, archived
Headers show
Series x86: Fix transient build breakage with featureset additions | expand

Commit Message

Andrew Cooper May 4, 2023, 7:39 p.m. UTC
When adding new words to a featureset, there is a reasonable amount of
boilerplate and it is preforable to split the addition into multiple patches.

GCC 12 spotted a real (transient) error which occurs when splitting additions
like this.  Right now, FEATURESET_NR_ENTRIES is dynamically generated from the
highest numeric XEN_CPUFEATURE() value, and can be less than what the
FEATURESET_* constants suggest the length of a featureset bitmap ought to be.

This causes the policy <-> featureset converters to genuinely access
out-of-bounds on the featureset array.

Rework X86_NR_FEAT to be related to FEATURESET_* alone, allowing it
specifically to grow larger than FEATURESET_NR_ENTRIES.

Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>

To preempt what I expect will be the first review question, no FEATURESET_*
can't become an enumeration, because the constants undergo token concatination
in the preprocess as part of making DECL_BITFIELD() work.
---
 xen/arch/x86/cpu-policy.c              | 7 +++++++
 xen/arch/x86/include/asm/cpufeatures.h | 5 +----
 xen/include/xen/lib/x86/cpu-policy.h   | 4 ++--
 xen/include/xen/lib/x86/cpuid-consts.h | 2 ++
 xen/lib/x86/cpuid.c                    | 6 +++---
 5 files changed, 15 insertions(+), 9 deletions(-)

Comments

Jan Beulich May 8, 2023, 7:45 a.m. UTC | #1
On 04.05.2023 21:39, Andrew Cooper wrote:
> When adding new words to a featureset, there is a reasonable amount of
> boilerplate and it is preforable to split the addition into multiple patches.
> 
> GCC 12 spotted a real (transient) error which occurs when splitting additions
> like this.  Right now, FEATURESET_NR_ENTRIES is dynamically generated from the
> highest numeric XEN_CPUFEATURE() value, and can be less than what the
> FEATURESET_* constants suggest the length of a featureset bitmap ought to be.
> 
> This causes the policy <-> featureset converters to genuinely access
> out-of-bounds on the featureset array.
> 
> Rework X86_NR_FEAT to be related to FEATURESET_* alone, allowing it
> specifically to grow larger than FEATURESET_NR_ENTRIES.
> 
> Reported-by: Jan Beulich <jbeulich@suse.com>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

While, like you, I could live with the previous patch even if I don't
particularly like it, I'm not convinced of the route you take here. Can't
we instead improve build-time checking, so the issue spotted late in the
build by gcc12 can be spotted earlier and/or be connected better to the
underlying reason?

One idea I have would deal with another aspect I don't like about our
present XEN_CPUFEATURE() as well: The *32 that's there in every use of
the macro. How about

XEN_CPUFEATURE(FSRCS,        10, 12) /*A  Fast Short REP CMPSB/SCASB */

as the common use and e.g.

XEN_CPUFEATURE(16)

or (if that ends up easier in gen-cpuid-py and/or the public header)
something like

XEN_CPUFEATURE(, 16, )

as the placeholder required for (at least trailing) unpopulated slots? Of
course the macro used may also be one of a different name, which may even
be necessary to keep the public header reasonably simple; maybe as much
as avoiding use of compiler extensions there. (This would then mean to
leave alone XEN_CPUFEATURE(), and my secondary adjustment would perhaps
become an independent change to make.)

> To preempt what I expect will be the first review question, no FEATURESET_*
> can't become an enumeration, because the constants undergo token concatination
> in the preprocess as part of making DECL_BITFIELD() work.

Just as a remark: I had trouble understanding this. Which was a result of
you referring to token concatenation being the problem (which is fine when
the results are enumerators), when really the issue is with the result of
the concatenation wanting to be expanded to a literal number.

Then again - do CPUID_BITFIELD_<n> really need to be named that way?
Couldn't they equally well be CPUID_BITFIELD_1d, CPUID_BITFIELD_e1c, and
alike, thus removing the need for intermediate macro expansion?

Jan
Andrew Cooper May 9, 2023, 2:03 p.m. UTC | #2
On 08/05/2023 8:45 am, Jan Beulich wrote:
> On 04.05.2023 21:39, Andrew Cooper wrote:
>> When adding new words to a featureset, there is a reasonable amount of
>> boilerplate and it is preforable to split the addition into multiple patches.
>>
>> GCC 12 spotted a real (transient) error which occurs when splitting additions
>> like this.  Right now, FEATURESET_NR_ENTRIES is dynamically generated from the
>> highest numeric XEN_CPUFEATURE() value, and can be less than what the
>> FEATURESET_* constants suggest the length of a featureset bitmap ought to be.
>>
>> This causes the policy <-> featureset converters to genuinely access
>> out-of-bounds on the featureset array.
>>
>> Rework X86_NR_FEAT to be related to FEATURESET_* alone, allowing it
>> specifically to grow larger than FEATURESET_NR_ENTRIES.
>>
>> Reported-by: Jan Beulich <jbeulich@suse.com>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> While, like you, I could live with the previous patch even if I don't
> particularly like it, I'm not convinced of the route you take here.

It's the route you tentatively agreed to in
https://lore.kernel.org/xen-devel/a282c338-98ab-6c3f-314b-267a5a82bad1@suse.com/

> Can't
> we instead improve build-time checking, so the issue spotted late in the
> build by gcc12 can be spotted earlier and/or be connected better to the
> underlying reason?

I don't understand what you mean by this.  For the transient period of
time, Xen's idea of a featureset *is* longer the autogen idea, hence the
work in this patch to decouple the two.

>
> One idea I have would deal with another aspect I don't like about our
> present XEN_CPUFEATURE() as well: The *32 that's there in every use of
> the macro. How about
>
> XEN_CPUFEATURE(FSRCS,        10, 12) /*A  Fast Short REP CMPSB/SCASB */
>
> as the common use and e.g.
>
> XEN_CPUFEATURE(16)
>
> or (if that ends up easier in gen-cpuid-py and/or the public header)
> something like
>
> XEN_CPUFEATURE(, 16, )
>
> as the placeholder required for (at least trailing) unpopulated slots? Of
> course the macro used may also be one of a different name, which may even
> be necessary to keep the public header reasonably simple; maybe as much
> as avoiding use of compiler extensions there. (This would then mean to
> leave alone XEN_CPUFEATURE(), and my secondary adjustment would perhaps
> become an independent change to make.)

Honestly, I don't want to hide the *32 part of the expression.  This
logic is already magic enough.

If we were to do something like this, I don't see what's wrong with just
having the value as a regular define at the end anyway.

One way or another with this approach, something needs updating in the
tail of cpufeatureset.h, and gen-cpuid.py can easily parse for a
specific named constant, and it will be less magic than overloading
XEN_CPUFEATURE().

>> To preempt what I expect will be the first review question, no FEATURESET_*
>> can't become an enumeration, because the constants undergo token concatination
>> in the preprocess as part of making DECL_BITFIELD() work.
> Just as a remark: I had trouble understanding this. Which was a result of
> you referring to token concatenation being the problem (which is fine when
> the results are enumerators), when really the issue is with the result of
> the concatenation wanting to be expanded to a literal number.
>
> Then again - do CPUID_BITFIELD_<n> really need to be named that way?
> Couldn't they equally well be CPUID_BITFIELD_1d, CPUID_BITFIELD_e1c, and
> alike, thus removing the need for intermediate macro expansion?

gen-cpuid.py doesn't know the short names; only Xen does, which is why
the expansion needs to know the name->word mapping.

I suppose this can be fixed, but it will require more magic comments and
more parsing to achieve.

~Andrew
Jan Beulich May 9, 2023, 2:24 p.m. UTC | #3
On 09.05.2023 16:03, Andrew Cooper wrote:
> On 08/05/2023 8:45 am, Jan Beulich wrote:
>> On 04.05.2023 21:39, Andrew Cooper wrote:
>>> When adding new words to a featureset, there is a reasonable amount of
>>> boilerplate and it is preforable to split the addition into multiple patches.
>>>
>>> GCC 12 spotted a real (transient) error which occurs when splitting additions
>>> like this.  Right now, FEATURESET_NR_ENTRIES is dynamically generated from the
>>> highest numeric XEN_CPUFEATURE() value, and can be less than what the
>>> FEATURESET_* constants suggest the length of a featureset bitmap ought to be.
>>>
>>> This causes the policy <-> featureset converters to genuinely access
>>> out-of-bounds on the featureset array.
>>>
>>> Rework X86_NR_FEAT to be related to FEATURESET_* alone, allowing it
>>> specifically to grow larger than FEATURESET_NR_ENTRIES.
>>>
>>> Reported-by: Jan Beulich <jbeulich@suse.com>
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> While, like you, I could live with the previous patch even if I don't
>> particularly like it, I'm not convinced of the route you take here.
> 
> It's the route you tentatively agreed to in
> https://lore.kernel.org/xen-devel/a282c338-98ab-6c3f-314b-267a5a82bad1@suse.com/

Right. Yet I deliberately said "may be the best" there, as something
better might turn up. And getting the two numbers to always agree, as
suggested, might end up being better.

>> Can't
>> we instead improve build-time checking, so the issue spotted late in the
>> build by gcc12 can be spotted earlier and/or be connected better to the
>> underlying reason?
> 
> I don't understand what you mean by this.  For the transient period of
> time, Xen's idea of a featureset *is* longer the autogen idea, hence the
> work in this patch to decouple the two.

Well, this part of my reply was just aiming at diagnosing the issue
as early as possible and as clearly as possible, such that one can
easily and quickly adjust whatever is missing in a change being worked
on. The main goal of course needs to be that this can't easily go
entirely unnoticed (as had happened, which has prompted this attempt
of yours of addressing the issue). I.e. diagnosing late is still far
better than failing to (without the compiler spotting it) altogether.

>> One idea I have would deal with another aspect I don't like about our
>> present XEN_CPUFEATURE() as well: The *32 that's there in every use of
>> the macro. How about
>>
>> XEN_CPUFEATURE(FSRCS,        10, 12) /*A  Fast Short REP CMPSB/SCASB */
>>
>> as the common use and e.g.
>>
>> XEN_CPUFEATURE(16)
>>
>> or (if that ends up easier in gen-cpuid-py and/or the public header)
>> something like
>>
>> XEN_CPUFEATURE(, 16, )
>>
>> as the placeholder required for (at least trailing) unpopulated slots? Of
>> course the macro used may also be one of a different name, which may even
>> be necessary to keep the public header reasonably simple; maybe as much
>> as avoiding use of compiler extensions there. (This would then mean to
>> leave alone XEN_CPUFEATURE(), and my secondary adjustment would perhaps
>> become an independent change to make.)
> 
> Honestly, I don't want to hide the *32 part of the expression.  This
> logic is already magic enough.

Well, I certainly wouldn't insist, but to me it looks pretty odd to have
it on all the lines.

> If we were to do something like this, I don't see what's wrong with just
> having the value as a regular define at the end anyway.
> 
> One way or another with this approach, something needs updating in the
> tail of cpufeatureset.h, and gen-cpuid.py can easily parse for a
> specific named constant, and it will be less magic than overloading
> XEN_CPUFEATURE().

If less overloading is deemed better - fine with me. Looking at the
script I wasn't sure hunting for an entirely different construct would
end up being more tidy.

What isn't really clear to me from your reply: Are you okay with trying
such an alternative approach? Or are you opposed to it? Or something in
the middle, like you being okay, but only if it's not you to actually
try it out?

>>> To preempt what I expect will be the first review question, no FEATURESET_*
>>> can't become an enumeration, because the constants undergo token concatination
>>> in the preprocess as part of making DECL_BITFIELD() work.
>> Just as a remark: I had trouble understanding this. Which was a result of
>> you referring to token concatenation being the problem (which is fine when
>> the results are enumerators), when really the issue is with the result of
>> the concatenation wanting to be expanded to a literal number.
>>
>> Then again - do CPUID_BITFIELD_<n> really need to be named that way?
>> Couldn't they equally well be CPUID_BITFIELD_1d, CPUID_BITFIELD_e1c, and
>> alike, thus removing the need for intermediate macro expansion?
> 
> gen-cpuid.py doesn't know the short names; only Xen does, which is why
> the expansion needs to know the name->word mapping.
> 
> I suppose this can be fixed, but it will require more magic comments and
> more parsing to achieve.

Okay, let's leave this entire aspect aside for now. It started from a
not-to-be-committed remark only anyway.

Jan
Andrew Cooper May 10, 2023, 8:13 p.m. UTC | #4
On 09/05/2023 3:24 pm, Jan Beulich wrote:
> On 09.05.2023 16:03, Andrew Cooper wrote:
>> On 08/05/2023 8:45 am, Jan Beulich wrote:
>>> On 04.05.2023 21:39, Andrew Cooper wrote:
>>>> When adding new words to a featureset, there is a reasonable amount of
>>>> boilerplate and it is preforable to split the addition into multiple patches.
>>>>
>>>> GCC 12 spotted a real (transient) error which occurs when splitting additions
>>>> like this.  Right now, FEATURESET_NR_ENTRIES is dynamically generated from the
>>>> highest numeric XEN_CPUFEATURE() value, and can be less than what the
>>>> FEATURESET_* constants suggest the length of a featureset bitmap ought to be.
>>>>
>>>> This causes the policy <-> featureset converters to genuinely access
>>>> out-of-bounds on the featureset array.
>>>>
>>>> Rework X86_NR_FEAT to be related to FEATURESET_* alone, allowing it
>>>> specifically to grow larger than FEATURESET_NR_ENTRIES.
>>>>
>>>> Reported-by: Jan Beulich <jbeulich@suse.com>
>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>> While, like you, I could live with the previous patch even if I don't
>>> particularly like it, I'm not convinced of the route you take here.
>> It's the route you tentatively agreed to in
>> https://lore.kernel.org/xen-devel/a282c338-98ab-6c3f-314b-267a5a82bad1@suse.com/
> Right. Yet I deliberately said "may be the best" there, as something
> better might turn up. And getting the two numbers to always agree, as
> suggested, might end up being better.

Then don't write "yes" if what you actually mean is "I'd prefer a
different option if possible", which is a "no".

I cannot read your mind, and we both know I do not have time to waste on
this task.

And now I have to go and spend yet more time doing it differently.

~Andrew
Jan Beulich May 11, 2023, 6:46 a.m. UTC | #5
On 10.05.2023 22:13, Andrew Cooper wrote:
> On 09/05/2023 3:24 pm, Jan Beulich wrote:
>> On 09.05.2023 16:03, Andrew Cooper wrote:
>>> On 08/05/2023 8:45 am, Jan Beulich wrote:
>>>> On 04.05.2023 21:39, Andrew Cooper wrote:
>>>>> When adding new words to a featureset, there is a reasonable amount of
>>>>> boilerplate and it is preforable to split the addition into multiple patches.
>>>>>
>>>>> GCC 12 spotted a real (transient) error which occurs when splitting additions
>>>>> like this.  Right now, FEATURESET_NR_ENTRIES is dynamically generated from the
>>>>> highest numeric XEN_CPUFEATURE() value, and can be less than what the
>>>>> FEATURESET_* constants suggest the length of a featureset bitmap ought to be.
>>>>>
>>>>> This causes the policy <-> featureset converters to genuinely access
>>>>> out-of-bounds on the featureset array.
>>>>>
>>>>> Rework X86_NR_FEAT to be related to FEATURESET_* alone, allowing it
>>>>> specifically to grow larger than FEATURESET_NR_ENTRIES.
>>>>>
>>>>> Reported-by: Jan Beulich <jbeulich@suse.com>
>>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>>> While, like you, I could live with the previous patch even if I don't
>>>> particularly like it, I'm not convinced of the route you take here.
>>> It's the route you tentatively agreed to in
>>> https://lore.kernel.org/xen-devel/a282c338-98ab-6c3f-314b-267a5a82bad1@suse.com/
>> Right. Yet I deliberately said "may be the best" there, as something
>> better might turn up. And getting the two numbers to always agree, as
>> suggested, might end up being better.
> 
> Then don't write "yes" if what you actually mean is "I'd prefer a
> different option if possible", which is a "no".
> 
> I cannot read your mind, and we both know I do not have time to waste on
> this task.
> 
> And now I have to go and spend yet more time doing it differently.

I'm sorry for that. Yet please also allow for me to re-consider an earlier
voiced view, once I see things in more detail.

Jan
diff mbox series

Patch

diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
index 774c512a03bd..00416244a3d8 100644
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -883,6 +883,13 @@  void __init init_dom0_cpuid_policy(struct domain *d)
 
 static void __init __maybe_unused build_assertions(void)
 {
+    /*
+     * Generally these are the same, but tend to differ when adding new
+     * infrastructure split across several patches.  Simply confirm that the
+     * gen-cpuid.py X86_FEATURE_* bits fit within the bitmaps we operate on.
+     */
+    BUILD_BUG_ON(FEATURESET_NR_ENTRIES > X86_NR_FEAT);
+
     /* Find some more clever allocation scheme if this trips. */
     BUILD_BUG_ON(sizeof(struct cpu_policy) > PAGE_SIZE);
 
diff --git a/xen/arch/x86/include/asm/cpufeatures.h b/xen/arch/x86/include/asm/cpufeatures.h
index 408ab4ba16a5..8989291bbfd6 100644
--- a/xen/arch/x86/include/asm/cpufeatures.h
+++ b/xen/arch/x86/include/asm/cpufeatures.h
@@ -2,10 +2,7 @@ 
  * Explicitly intended for multiple inclusion.
  */
 
-#include <xen/lib/x86/cpuid-autogen.h>
-
-/* Number of capability words covered by the featureset words. */
-#define X86_NR_FEAT FEATURESET_NR_ENTRIES
+#include <xen/lib/x86/cpuid-consts.h>
 
 /* Synthetic words follow the featureset words. */
 #define X86_NR_SYNTH 1
diff --git a/xen/include/xen/lib/x86/cpu-policy.h b/xen/include/xen/lib/x86/cpu-policy.h
index e9bda14a7595..01431de056c8 100644
--- a/xen/include/xen/lib/x86/cpu-policy.h
+++ b/xen/include/xen/lib/x86/cpu-policy.h
@@ -370,12 +370,12 @@  struct cpu_policy_errors
  * Copy the featureset words out of a cpu_policy object.
  */
 void x86_cpu_policy_to_featureset(const struct cpu_policy *p,
-                                  uint32_t fs[FEATURESET_NR_ENTRIES]);
+                                  uint32_t fs[X86_NR_FEAT]);
 
 /**
  * Copy the featureset words back into a cpu_policy object.
  */
-void x86_cpu_featureset_to_policy(const uint32_t fs[FEATURESET_NR_ENTRIES],
+void x86_cpu_featureset_to_policy(const uint32_t fs[X86_NR_FEAT],
                                   struct cpu_policy *p);
 
 static inline uint64_t cpu_policy_xcr0_max(const struct cpu_policy *p)
diff --git a/xen/include/xen/lib/x86/cpuid-consts.h b/xen/include/xen/lib/x86/cpuid-consts.h
index 6ca8c39a3df4..9fe931b8e31f 100644
--- a/xen/include/xen/lib/x86/cpuid-consts.h
+++ b/xen/include/xen/lib/x86/cpuid-consts.h
@@ -21,6 +21,8 @@ 
 #define FEATURESET_7c1   14 /* 0x00000007:1.ecx    */
 #define FEATURESET_7d1   15 /* 0x00000007:1.edx    */
 
+#define X86_NR_FEAT (FEATURESET_7d1 + 1)
+
 #endif /* !XEN_LIB_X86_CONSTS_H */
 
 /*
diff --git a/xen/lib/x86/cpuid.c b/xen/lib/x86/cpuid.c
index 68aafb404927..76f26e92af8d 100644
--- a/xen/lib/x86/cpuid.c
+++ b/xen/lib/x86/cpuid.c
@@ -61,7 +61,7 @@  const char *x86_cpuid_vendor_to_str(unsigned int vendor)
 }
 
 void x86_cpu_policy_to_featureset(
-    const struct cpu_policy *p, uint32_t fs[FEATURESET_NR_ENTRIES])
+    const struct cpu_policy *p, uint32_t fs[X86_NR_FEAT])
 {
     fs[FEATURESET_1d]        = p->basic._1d;
     fs[FEATURESET_1c]        = p->basic._1c;
@@ -82,7 +82,7 @@  void x86_cpu_policy_to_featureset(
 }
 
 void x86_cpu_featureset_to_policy(
-    const uint32_t fs[FEATURESET_NR_ENTRIES], struct cpu_policy *p)
+    const uint32_t fs[X86_NR_FEAT], struct cpu_policy *p)
 {
     p->basic._1d             = fs[FEATURESET_1d];
     p->basic._1c             = fs[FEATURESET_1c];
@@ -285,7 +285,7 @@  const uint32_t *x86_cpu_policy_lookup_deep_deps(uint32_t feature)
     static const uint32_t deep_features[] = INIT_DEEP_FEATURES;
     static const struct {
         uint32_t feature;
-        uint32_t fs[FEATURESET_NR_ENTRIES];
+        uint32_t fs[X86_NR_FEAT];
     } deep_deps[] = INIT_DEEP_DEPS;
     unsigned int start = 0, end = ARRAY_SIZE(deep_deps);