Message ID | 20230504193924.3305496-6-andrew.cooper3@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86: Fix transient build breakage with featureset additions | expand |
On 04.05.2023 21:39, Andrew Cooper wrote: > When adding new words to a featureset, there is a reasonable amount of > boilerplate and it is preforable to split the addition into multiple patches. > > GCC 12 spotted a real (transient) error which occurs when splitting additions > like this. Right now, FEATURESET_NR_ENTRIES is dynamically generated from the > highest numeric XEN_CPUFEATURE() value, and can be less than what the > FEATURESET_* constants suggest the length of a featureset bitmap ought to be. > > This causes the policy <-> featureset converters to genuinely access > out-of-bounds on the featureset array. > > Rework X86_NR_FEAT to be related to FEATURESET_* alone, allowing it > specifically to grow larger than FEATURESET_NR_ENTRIES. > > Reported-by: Jan Beulich <jbeulich@suse.com> > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> While, like you, I could live with the previous patch even if I don't particularly like it, I'm not convinced of the route you take here. Can't we instead improve build-time checking, so the issue spotted late in the build by gcc12 can be spotted earlier and/or be connected better to the underlying reason? One idea I have would deal with another aspect I don't like about our present XEN_CPUFEATURE() as well: The *32 that's there in every use of the macro. How about XEN_CPUFEATURE(FSRCS, 10, 12) /*A Fast Short REP CMPSB/SCASB */ as the common use and e.g. XEN_CPUFEATURE(16) or (if that ends up easier in gen-cpuid-py and/or the public header) something like XEN_CPUFEATURE(, 16, ) as the placeholder required for (at least trailing) unpopulated slots? Of course the macro used may also be one of a different name, which may even be necessary to keep the public header reasonably simple; maybe as much as avoiding use of compiler extensions there. (This would then mean to leave alone XEN_CPUFEATURE(), and my secondary adjustment would perhaps become an independent change to make.) > To preempt what I expect will be the first review question, no FEATURESET_* > can't become an enumeration, because the constants undergo token concatination > in the preprocess as part of making DECL_BITFIELD() work. Just as a remark: I had trouble understanding this. Which was a result of you referring to token concatenation being the problem (which is fine when the results are enumerators), when really the issue is with the result of the concatenation wanting to be expanded to a literal number. Then again - do CPUID_BITFIELD_<n> really need to be named that way? Couldn't they equally well be CPUID_BITFIELD_1d, CPUID_BITFIELD_e1c, and alike, thus removing the need for intermediate macro expansion? Jan
On 08/05/2023 8:45 am, Jan Beulich wrote: > On 04.05.2023 21:39, Andrew Cooper wrote: >> When adding new words to a featureset, there is a reasonable amount of >> boilerplate and it is preforable to split the addition into multiple patches. >> >> GCC 12 spotted a real (transient) error which occurs when splitting additions >> like this. Right now, FEATURESET_NR_ENTRIES is dynamically generated from the >> highest numeric XEN_CPUFEATURE() value, and can be less than what the >> FEATURESET_* constants suggest the length of a featureset bitmap ought to be. >> >> This causes the policy <-> featureset converters to genuinely access >> out-of-bounds on the featureset array. >> >> Rework X86_NR_FEAT to be related to FEATURESET_* alone, allowing it >> specifically to grow larger than FEATURESET_NR_ENTRIES. >> >> Reported-by: Jan Beulich <jbeulich@suse.com> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > While, like you, I could live with the previous patch even if I don't > particularly like it, I'm not convinced of the route you take here. It's the route you tentatively agreed to in https://lore.kernel.org/xen-devel/a282c338-98ab-6c3f-314b-267a5a82bad1@suse.com/ > Can't > we instead improve build-time checking, so the issue spotted late in the > build by gcc12 can be spotted earlier and/or be connected better to the > underlying reason? I don't understand what you mean by this. For the transient period of time, Xen's idea of a featureset *is* longer the autogen idea, hence the work in this patch to decouple the two. > > One idea I have would deal with another aspect I don't like about our > present XEN_CPUFEATURE() as well: The *32 that's there in every use of > the macro. How about > > XEN_CPUFEATURE(FSRCS, 10, 12) /*A Fast Short REP CMPSB/SCASB */ > > as the common use and e.g. > > XEN_CPUFEATURE(16) > > or (if that ends up easier in gen-cpuid-py and/or the public header) > something like > > XEN_CPUFEATURE(, 16, ) > > as the placeholder required for (at least trailing) unpopulated slots? Of > course the macro used may also be one of a different name, which may even > be necessary to keep the public header reasonably simple; maybe as much > as avoiding use of compiler extensions there. (This would then mean to > leave alone XEN_CPUFEATURE(), and my secondary adjustment would perhaps > become an independent change to make.) Honestly, I don't want to hide the *32 part of the expression. This logic is already magic enough. If we were to do something like this, I don't see what's wrong with just having the value as a regular define at the end anyway. One way or another with this approach, something needs updating in the tail of cpufeatureset.h, and gen-cpuid.py can easily parse for a specific named constant, and it will be less magic than overloading XEN_CPUFEATURE(). >> To preempt what I expect will be the first review question, no FEATURESET_* >> can't become an enumeration, because the constants undergo token concatination >> in the preprocess as part of making DECL_BITFIELD() work. > Just as a remark: I had trouble understanding this. Which was a result of > you referring to token concatenation being the problem (which is fine when > the results are enumerators), when really the issue is with the result of > the concatenation wanting to be expanded to a literal number. > > Then again - do CPUID_BITFIELD_<n> really need to be named that way? > Couldn't they equally well be CPUID_BITFIELD_1d, CPUID_BITFIELD_e1c, and > alike, thus removing the need for intermediate macro expansion? gen-cpuid.py doesn't know the short names; only Xen does, which is why the expansion needs to know the name->word mapping. I suppose this can be fixed, but it will require more magic comments and more parsing to achieve. ~Andrew
On 09.05.2023 16:03, Andrew Cooper wrote: > On 08/05/2023 8:45 am, Jan Beulich wrote: >> On 04.05.2023 21:39, Andrew Cooper wrote: >>> When adding new words to a featureset, there is a reasonable amount of >>> boilerplate and it is preforable to split the addition into multiple patches. >>> >>> GCC 12 spotted a real (transient) error which occurs when splitting additions >>> like this. Right now, FEATURESET_NR_ENTRIES is dynamically generated from the >>> highest numeric XEN_CPUFEATURE() value, and can be less than what the >>> FEATURESET_* constants suggest the length of a featureset bitmap ought to be. >>> >>> This causes the policy <-> featureset converters to genuinely access >>> out-of-bounds on the featureset array. >>> >>> Rework X86_NR_FEAT to be related to FEATURESET_* alone, allowing it >>> specifically to grow larger than FEATURESET_NR_ENTRIES. >>> >>> Reported-by: Jan Beulich <jbeulich@suse.com> >>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >> While, like you, I could live with the previous patch even if I don't >> particularly like it, I'm not convinced of the route you take here. > > It's the route you tentatively agreed to in > https://lore.kernel.org/xen-devel/a282c338-98ab-6c3f-314b-267a5a82bad1@suse.com/ Right. Yet I deliberately said "may be the best" there, as something better might turn up. And getting the two numbers to always agree, as suggested, might end up being better. >> Can't >> we instead improve build-time checking, so the issue spotted late in the >> build by gcc12 can be spotted earlier and/or be connected better to the >> underlying reason? > > I don't understand what you mean by this. For the transient period of > time, Xen's idea of a featureset *is* longer the autogen idea, hence the > work in this patch to decouple the two. Well, this part of my reply was just aiming at diagnosing the issue as early as possible and as clearly as possible, such that one can easily and quickly adjust whatever is missing in a change being worked on. The main goal of course needs to be that this can't easily go entirely unnoticed (as had happened, which has prompted this attempt of yours of addressing the issue). I.e. diagnosing late is still far better than failing to (without the compiler spotting it) altogether. >> One idea I have would deal with another aspect I don't like about our >> present XEN_CPUFEATURE() as well: The *32 that's there in every use of >> the macro. How about >> >> XEN_CPUFEATURE(FSRCS, 10, 12) /*A Fast Short REP CMPSB/SCASB */ >> >> as the common use and e.g. >> >> XEN_CPUFEATURE(16) >> >> or (if that ends up easier in gen-cpuid-py and/or the public header) >> something like >> >> XEN_CPUFEATURE(, 16, ) >> >> as the placeholder required for (at least trailing) unpopulated slots? Of >> course the macro used may also be one of a different name, which may even >> be necessary to keep the public header reasonably simple; maybe as much >> as avoiding use of compiler extensions there. (This would then mean to >> leave alone XEN_CPUFEATURE(), and my secondary adjustment would perhaps >> become an independent change to make.) > > Honestly, I don't want to hide the *32 part of the expression. This > logic is already magic enough. Well, I certainly wouldn't insist, but to me it looks pretty odd to have it on all the lines. > If we were to do something like this, I don't see what's wrong with just > having the value as a regular define at the end anyway. > > One way or another with this approach, something needs updating in the > tail of cpufeatureset.h, and gen-cpuid.py can easily parse for a > specific named constant, and it will be less magic than overloading > XEN_CPUFEATURE(). If less overloading is deemed better - fine with me. Looking at the script I wasn't sure hunting for an entirely different construct would end up being more tidy. What isn't really clear to me from your reply: Are you okay with trying such an alternative approach? Or are you opposed to it? Or something in the middle, like you being okay, but only if it's not you to actually try it out? >>> To preempt what I expect will be the first review question, no FEATURESET_* >>> can't become an enumeration, because the constants undergo token concatination >>> in the preprocess as part of making DECL_BITFIELD() work. >> Just as a remark: I had trouble understanding this. Which was a result of >> you referring to token concatenation being the problem (which is fine when >> the results are enumerators), when really the issue is with the result of >> the concatenation wanting to be expanded to a literal number. >> >> Then again - do CPUID_BITFIELD_<n> really need to be named that way? >> Couldn't they equally well be CPUID_BITFIELD_1d, CPUID_BITFIELD_e1c, and >> alike, thus removing the need for intermediate macro expansion? > > gen-cpuid.py doesn't know the short names; only Xen does, which is why > the expansion needs to know the name->word mapping. > > I suppose this can be fixed, but it will require more magic comments and > more parsing to achieve. Okay, let's leave this entire aspect aside for now. It started from a not-to-be-committed remark only anyway. Jan
On 09/05/2023 3:24 pm, Jan Beulich wrote: > On 09.05.2023 16:03, Andrew Cooper wrote: >> On 08/05/2023 8:45 am, Jan Beulich wrote: >>> On 04.05.2023 21:39, Andrew Cooper wrote: >>>> When adding new words to a featureset, there is a reasonable amount of >>>> boilerplate and it is preforable to split the addition into multiple patches. >>>> >>>> GCC 12 spotted a real (transient) error which occurs when splitting additions >>>> like this. Right now, FEATURESET_NR_ENTRIES is dynamically generated from the >>>> highest numeric XEN_CPUFEATURE() value, and can be less than what the >>>> FEATURESET_* constants suggest the length of a featureset bitmap ought to be. >>>> >>>> This causes the policy <-> featureset converters to genuinely access >>>> out-of-bounds on the featureset array. >>>> >>>> Rework X86_NR_FEAT to be related to FEATURESET_* alone, allowing it >>>> specifically to grow larger than FEATURESET_NR_ENTRIES. >>>> >>>> Reported-by: Jan Beulich <jbeulich@suse.com> >>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >>> While, like you, I could live with the previous patch even if I don't >>> particularly like it, I'm not convinced of the route you take here. >> It's the route you tentatively agreed to in >> https://lore.kernel.org/xen-devel/a282c338-98ab-6c3f-314b-267a5a82bad1@suse.com/ > Right. Yet I deliberately said "may be the best" there, as something > better might turn up. And getting the two numbers to always agree, as > suggested, might end up being better. Then don't write "yes" if what you actually mean is "I'd prefer a different option if possible", which is a "no". I cannot read your mind, and we both know I do not have time to waste on this task. And now I have to go and spend yet more time doing it differently. ~Andrew
On 10.05.2023 22:13, Andrew Cooper wrote: > On 09/05/2023 3:24 pm, Jan Beulich wrote: >> On 09.05.2023 16:03, Andrew Cooper wrote: >>> On 08/05/2023 8:45 am, Jan Beulich wrote: >>>> On 04.05.2023 21:39, Andrew Cooper wrote: >>>>> When adding new words to a featureset, there is a reasonable amount of >>>>> boilerplate and it is preforable to split the addition into multiple patches. >>>>> >>>>> GCC 12 spotted a real (transient) error which occurs when splitting additions >>>>> like this. Right now, FEATURESET_NR_ENTRIES is dynamically generated from the >>>>> highest numeric XEN_CPUFEATURE() value, and can be less than what the >>>>> FEATURESET_* constants suggest the length of a featureset bitmap ought to be. >>>>> >>>>> This causes the policy <-> featureset converters to genuinely access >>>>> out-of-bounds on the featureset array. >>>>> >>>>> Rework X86_NR_FEAT to be related to FEATURESET_* alone, allowing it >>>>> specifically to grow larger than FEATURESET_NR_ENTRIES. >>>>> >>>>> Reported-by: Jan Beulich <jbeulich@suse.com> >>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >>>> While, like you, I could live with the previous patch even if I don't >>>> particularly like it, I'm not convinced of the route you take here. >>> It's the route you tentatively agreed to in >>> https://lore.kernel.org/xen-devel/a282c338-98ab-6c3f-314b-267a5a82bad1@suse.com/ >> Right. Yet I deliberately said "may be the best" there, as something >> better might turn up. And getting the two numbers to always agree, as >> suggested, might end up being better. > > Then don't write "yes" if what you actually mean is "I'd prefer a > different option if possible", which is a "no". > > I cannot read your mind, and we both know I do not have time to waste on > this task. > > And now I have to go and spend yet more time doing it differently. I'm sorry for that. Yet please also allow for me to re-consider an earlier voiced view, once I see things in more detail. Jan
diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c index 774c512a03bd..00416244a3d8 100644 --- a/xen/arch/x86/cpu-policy.c +++ b/xen/arch/x86/cpu-policy.c @@ -883,6 +883,13 @@ void __init init_dom0_cpuid_policy(struct domain *d) static void __init __maybe_unused build_assertions(void) { + /* + * Generally these are the same, but tend to differ when adding new + * infrastructure split across several patches. Simply confirm that the + * gen-cpuid.py X86_FEATURE_* bits fit within the bitmaps we operate on. + */ + BUILD_BUG_ON(FEATURESET_NR_ENTRIES > X86_NR_FEAT); + /* Find some more clever allocation scheme if this trips. */ BUILD_BUG_ON(sizeof(struct cpu_policy) > PAGE_SIZE); diff --git a/xen/arch/x86/include/asm/cpufeatures.h b/xen/arch/x86/include/asm/cpufeatures.h index 408ab4ba16a5..8989291bbfd6 100644 --- a/xen/arch/x86/include/asm/cpufeatures.h +++ b/xen/arch/x86/include/asm/cpufeatures.h @@ -2,10 +2,7 @@ * Explicitly intended for multiple inclusion. */ -#include <xen/lib/x86/cpuid-autogen.h> - -/* Number of capability words covered by the featureset words. */ -#define X86_NR_FEAT FEATURESET_NR_ENTRIES +#include <xen/lib/x86/cpuid-consts.h> /* Synthetic words follow the featureset words. */ #define X86_NR_SYNTH 1 diff --git a/xen/include/xen/lib/x86/cpu-policy.h b/xen/include/xen/lib/x86/cpu-policy.h index e9bda14a7595..01431de056c8 100644 --- a/xen/include/xen/lib/x86/cpu-policy.h +++ b/xen/include/xen/lib/x86/cpu-policy.h @@ -370,12 +370,12 @@ struct cpu_policy_errors * Copy the featureset words out of a cpu_policy object. */ void x86_cpu_policy_to_featureset(const struct cpu_policy *p, - uint32_t fs[FEATURESET_NR_ENTRIES]); + uint32_t fs[X86_NR_FEAT]); /** * Copy the featureset words back into a cpu_policy object. */ -void x86_cpu_featureset_to_policy(const uint32_t fs[FEATURESET_NR_ENTRIES], +void x86_cpu_featureset_to_policy(const uint32_t fs[X86_NR_FEAT], struct cpu_policy *p); static inline uint64_t cpu_policy_xcr0_max(const struct cpu_policy *p) diff --git a/xen/include/xen/lib/x86/cpuid-consts.h b/xen/include/xen/lib/x86/cpuid-consts.h index 6ca8c39a3df4..9fe931b8e31f 100644 --- a/xen/include/xen/lib/x86/cpuid-consts.h +++ b/xen/include/xen/lib/x86/cpuid-consts.h @@ -21,6 +21,8 @@ #define FEATURESET_7c1 14 /* 0x00000007:1.ecx */ #define FEATURESET_7d1 15 /* 0x00000007:1.edx */ +#define X86_NR_FEAT (FEATURESET_7d1 + 1) + #endif /* !XEN_LIB_X86_CONSTS_H */ /* diff --git a/xen/lib/x86/cpuid.c b/xen/lib/x86/cpuid.c index 68aafb404927..76f26e92af8d 100644 --- a/xen/lib/x86/cpuid.c +++ b/xen/lib/x86/cpuid.c @@ -61,7 +61,7 @@ const char *x86_cpuid_vendor_to_str(unsigned int vendor) } void x86_cpu_policy_to_featureset( - const struct cpu_policy *p, uint32_t fs[FEATURESET_NR_ENTRIES]) + const struct cpu_policy *p, uint32_t fs[X86_NR_FEAT]) { fs[FEATURESET_1d] = p->basic._1d; fs[FEATURESET_1c] = p->basic._1c; @@ -82,7 +82,7 @@ void x86_cpu_policy_to_featureset( } void x86_cpu_featureset_to_policy( - const uint32_t fs[FEATURESET_NR_ENTRIES], struct cpu_policy *p) + const uint32_t fs[X86_NR_FEAT], struct cpu_policy *p) { p->basic._1d = fs[FEATURESET_1d]; p->basic._1c = fs[FEATURESET_1c]; @@ -285,7 +285,7 @@ const uint32_t *x86_cpu_policy_lookup_deep_deps(uint32_t feature) static const uint32_t deep_features[] = INIT_DEEP_FEATURES; static const struct { uint32_t feature; - uint32_t fs[FEATURESET_NR_ENTRIES]; + uint32_t fs[X86_NR_FEAT]; } deep_deps[] = INIT_DEEP_DEPS; unsigned int start = 0, end = ARRAY_SIZE(deep_deps);
When adding new words to a featureset, there is a reasonable amount of boilerplate and it is preforable to split the addition into multiple patches. GCC 12 spotted a real (transient) error which occurs when splitting additions like this. Right now, FEATURESET_NR_ENTRIES is dynamically generated from the highest numeric XEN_CPUFEATURE() value, and can be less than what the FEATURESET_* constants suggest the length of a featureset bitmap ought to be. This causes the policy <-> featureset converters to genuinely access out-of-bounds on the featureset array. Rework X86_NR_FEAT to be related to FEATURESET_* alone, allowing it specifically to grow larger than FEATURESET_NR_ENTRIES. Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> --- CC: Jan Beulich <JBeulich@suse.com> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Wei Liu <wl@xen.org> To preempt what I expect will be the first review question, no FEATURESET_* can't become an enumeration, because the constants undergo token concatination in the preprocess as part of making DECL_BITFIELD() work. --- xen/arch/x86/cpu-policy.c | 7 +++++++ xen/arch/x86/include/asm/cpufeatures.h | 5 +---- xen/include/xen/lib/x86/cpu-policy.h | 4 ++-- xen/include/xen/lib/x86/cpuid-consts.h | 2 ++ xen/lib/x86/cpuid.c | 6 +++--- 5 files changed, 15 insertions(+), 9 deletions(-)