Message ID | 7026f7df-1f70-0018-a6eb-b7e043b279d8@suse.com (mailing list archive) |
---|---|
State | Superseded |
Series | x86: correct fencing around CLFLUSH (+some tidying) |
On 23/02/2022 10:13, Jan Beulich wrote:
> As noted in the context of 3330013e6739 ("VT-d / x86: re-arrange cache
> syncing"): While cache_writeback() has the SFENCE on the correct side of
> CLFLUSHOPT, flush_area_local() doesn't. While I can't prove it due to
> lacking a copy of the old SDM version, I can only assume this placement
> was a result of what had been described there originally. In any event,
> recent versions of the SDM have been telling us otherwise.
>
> For AMD (and Hygon, albeit there it's benign, since all their CPUs are
> expected to support CLFLUSHOPT) the situation is more complex: MFENCE is
> needed ahead of and/or after CLFLUSH when the CPU doesn't also support
> CLFLUSHOPT. (It's "and" in our case, as we cannot know what the caller's
> needs are.)
>
> Fixes: 623c720fc8da3 ("x86: use CLFLUSHOPT when available")
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/cpu/common.c
> +++ b/xen/arch/x86/cpu/common.c
> @@ -346,9 +346,14 @@ void __init early_cpu_init(void)
>              c->x86_model, c->x86_model, c->x86_mask, eax);
>
>      if (c->cpuid_level >= 7)
> -        cpuid_count(7, 0, &eax, &ebx,
> +        cpuid_count(7, 0, &eax,
> +                    &c->x86_capability[FEATURESET_7b0],
>                      &c->x86_capability[FEATURESET_7c0], &edx);
>
> +    if (!(c->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ||
> +        cpu_has(c, X86_FEATURE_CLFLUSHOPT))
> +        setup_force_cpu_cap(X86_FEATURE_CLFLUSH_NO_MFENCE);

This is somewhat ugly, not only because it presumes that the early AMD
implementation peculiarities are common.

It also has a corner case that goes wrong when the BSP enumerates
CLFLUSHOPT but later CPUs don't. In this case the workaround will be
disengaged even when it's not safe to.

Most importantly however, it makes the one current slow usecase (VT-d on
early Intel with only CLFLUSH) even slower.

I suggest inverting this workaround (and IMO, using the bug
infrastructure, because that's exactly what it is) and leaving a big
warning by the function saying "don't use on AMD before alternatives
have run" or something. It's quite possibly a problem we'll never need
to solve in practice, although my plans for overhauling CPUID scanning
will probably fix it because we can move the first alternatives pass far
earlier as a consequence.

> +
>      eax = cpuid_eax(0x80000000);
>      if ((eax >> 16) == 0x8000 && eax >= 0x80000008) {
>          ebx = eax >= 0x8000001f ? cpuid_ebx(0x8000001f) : 0;
> --- a/xen/arch/x86/flushtlb.c
> +++ b/xen/arch/x86/flushtlb.c
> @@ -245,12 +245,15 @@ unsigned int flush_area_local(const void
>               c->x86_clflush_size && c->x86_cache_size && sz &&
>               ((sz >> 10) < c->x86_cache_size) )
>          {
> -            alternative("", "sfence", X86_FEATURE_CLFLUSHOPT);
> +            alternative("mfence", , X86_FEATURE_CLFLUSH_NO_MFENCE);

As an aside, the absence of "" is a very weird parse, and only compiles
because this is a macro rather than a function.

I'd request that it stays, simply to make the code read more like regular C.

~Andrew
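To illustrate the parse Andrew is referring to: since C99, an argument of a function-like macro may be entirely empty, in which case it simply expands to nothing; an ordinary function call could not drop an argument this way, which is why `alternative("mfence", , ...)` compiles at all. A minimal standalone sketch (not Xen code; the macro and strings below are made up for illustration):

```c
#include <stdio.h>

/* Builds one string by literal concatenation, much like Xen's alternative()
 * pastes its oldinstr/newinstr arguments into an asm() template string. */
#define JOIN2(a, b) puts("[" a "][" b "]")

int main(void)
{
    JOIN2("mfence", "sfence");   /* prints: [mfence][sfence] */
    JOIN2("mfence", );           /* empty 2nd argument, still valid: [mfence][] */
    JOIN2("mfence", "");         /* "" reads more like a normal call: [mfence][] */
    return 0;
}
```

Whether the empty replacement is spelled `""` or omitted entirely makes no difference to the expansion here; the disagreement in the thread is purely about readability.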
On 23.02.2022 13:33, Andrew Cooper wrote:
> On 23/02/2022 10:13, Jan Beulich wrote:
>> --- a/xen/arch/x86/cpu/common.c
>> +++ b/xen/arch/x86/cpu/common.c
>> @@ -346,9 +346,14 @@ void __init early_cpu_init(void)
>>              c->x86_model, c->x86_model, c->x86_mask, eax);
>>
>>      if (c->cpuid_level >= 7)
>> -        cpuid_count(7, 0, &eax, &ebx,
>> +        cpuid_count(7, 0, &eax,
>> +                    &c->x86_capability[FEATURESET_7b0],
>>                      &c->x86_capability[FEATURESET_7c0], &edx);
>>
>> +    if (!(c->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ||
>> +        cpu_has(c, X86_FEATURE_CLFLUSHOPT))
>> +        setup_force_cpu_cap(X86_FEATURE_CLFLUSH_NO_MFENCE);
>
> This is somewhat ugly, not only because it presumes that the early AMD
> implementation peculiarities are common.
>
> It also has a corner case that goes wrong when the BSP enumerates
> CLFLUSHOPT but later CPUs don't. In this case the workaround will be
> disengaged even when it's not safe to.

You realize though that this cannot be taken care of when alternatives
patching is involved? Are you suggesting not to use patching just to deal
with an asymmetry we don't really deal with anywhere else?

> Most importantly however, it makes the one current slow usecase (VT-d on
> early Intel with only CLFLUSH) even slower.
>
> I suggest inverting this workaround (and IMO, using the bug
> infrastructure, because that's exactly what it is) and leaving a big
> warning by the function saying "don't use on AMD before alternatives
> have run" or something. It's quite possibly a problem we'll never need
> to solve in practice, although my plans for overhauling CPUID scanning
> will probably fix it because we can move the first alternatives pass far
> earlier as a consequence.

I've switched to marking this as a BUG, but I'm not sure about such a
comment: It really depends on the use whether it would be safe without the
MFENCEs. (We also aren't aware of problems, despite them having been
missing forever.) Furthermore, it's not overly likely for anyone to look
here when adding a new use of FLUSH_CACHE. I'd therefore rather consider
it best-effort behavior until patching has taken place.

>> --- a/xen/arch/x86/flushtlb.c
>> +++ b/xen/arch/x86/flushtlb.c
>> @@ -245,12 +245,15 @@ unsigned int flush_area_local(const void
>>               c->x86_clflush_size && c->x86_cache_size && sz &&
>>               ((sz >> 10) < c->x86_cache_size) )
>>          {
>> -            alternative("", "sfence", X86_FEATURE_CLFLUSHOPT);
>> +            alternative("mfence", , X86_FEATURE_CLFLUSH_NO_MFENCE);
>
> As an aside, the absence of "" is a very weird parse, and only compiles
> because this is a macro rather than a function.
>
> I'd request that it stays, simply to make the code read more like regular C.

To be honest, I was half expecting this feedback. For now I've put back the
quotes, but I have a change halfway ready which will eliminate the need for
quotes in the same cases where the assembler macros don't require their use
(as you may guess: by using the assembler macros instead of maintaining
redundant C infrastructure). I guess at that point it would become a little
inconsistent if quotes were present just to express "empty". Also, if I'm
not mistaken, this isn't the only place where we make use of macro
arguments being allowed to be empty.

Jan
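For concreteness, a rough sketch of the inversion being discussed, using a hypothetical X86_BUG_CLFLUSH_MFENCE bit (the name, wording, and placement are assumptions for illustration only, not necessarily what a later revision of the patch does): a synthetic "bug" bit records the CPUs that do need the fences, so both the unpatched default and the common case stay fence-free.

```c
/* Hypothetical inversion sketch (not the posted patch): mark the CPUs that
 * need MFENCE around CLFLUSH as having a "bug", instead of marking all
 * other CPUs with a feature. */

/* In early_cpu_init() / CPU identification: */
if ((c->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) &&
    !cpu_has(c, X86_FEATURE_CLFLUSHOPT))
    setup_force_cpu_cap(X86_BUG_CLFLUSH_MFENCE);   /* assumed name */

/* In flush_area_local(): the default (unpatched) text is now empty, so the
 * existing slow case (early Intel, CLFLUSH only) gains no extra fence, and
 * MFENCE is patched in only on the CPUs that need it.
 * Trade-off raised in the thread: before alternatives have run, those AMD
 * CPUs see no fence at all. */
alternative("", "mfence", X86_BUG_CLFLUSH_MFENCE);
```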
```diff
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -346,9 +346,14 @@ void __init early_cpu_init(void)
             c->x86_model, c->x86_model, c->x86_mask, eax);
 
     if (c->cpuid_level >= 7)
-        cpuid_count(7, 0, &eax, &ebx,
+        cpuid_count(7, 0, &eax,
+                    &c->x86_capability[FEATURESET_7b0],
                     &c->x86_capability[FEATURESET_7c0], &edx);
 
+    if (!(c->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ||
+        cpu_has(c, X86_FEATURE_CLFLUSHOPT))
+        setup_force_cpu_cap(X86_FEATURE_CLFLUSH_NO_MFENCE);
+
     eax = cpuid_eax(0x80000000);
     if ((eax >> 16) == 0x8000 && eax >= 0x80000008) {
         ebx = eax >= 0x8000001f ? cpuid_ebx(0x8000001f) : 0;
--- a/xen/arch/x86/flushtlb.c
+++ b/xen/arch/x86/flushtlb.c
@@ -245,12 +245,15 @@ unsigned int flush_area_local(const void
              c->x86_clflush_size && c->x86_cache_size && sz &&
              ((sz >> 10) < c->x86_cache_size) )
         {
-            alternative("", "sfence", X86_FEATURE_CLFLUSHOPT);
+            alternative("mfence", , X86_FEATURE_CLFLUSH_NO_MFENCE);
 
             for ( i = 0; i < sz; i += c->x86_clflush_size )
                 alternative_input("ds; clflush %0",
                                   "data16 clflush %0",      /* clflushopt */
                                   X86_FEATURE_CLFLUSHOPT,
                                   "m" (((const char *)va)[i]));
+            alternative_2("mfence",
+                          , X86_FEATURE_CLFLUSH_NO_MFENCE,
+                          "sfence", X86_FEATURE_CLFLUSHOPT);
+
             flags &= ~FLUSH_CACHE;
         }
         else
@@ -274,6 +277,8 @@ void cache_writeback(const void *addr, u
     unsigned int clflush_size = current_cpu_data.x86_clflush_size ?: 16;
     const void *end = addr + size;
 
+    alternative("mfence", , X86_FEATURE_CLFLUSH_NO_MFENCE);
+
     addr -= (unsigned long)addr & (clflush_size - 1);
     for ( ; addr < end; addr += clflush_size )
     {
--- a/xen/arch/x86/include/asm/cpufeatures.h
+++ b/xen/arch/x86/include/asm/cpufeatures.h
@@ -24,7 +24,7 @@ XEN_CPUFEATURE(APERFMPERF,        X86_SY
 XEN_CPUFEATURE(MFENCE_RDTSC,      X86_SYNTH( 9)) /* MFENCE synchronizes RDTSC */
 XEN_CPUFEATURE(XEN_SMEP,          X86_SYNTH(10)) /* SMEP gets used by Xen itself */
 XEN_CPUFEATURE(XEN_SMAP,          X86_SYNTH(11)) /* SMAP gets used by Xen itself */
-/* Bit 12 - unused. */
+XEN_CPUFEATURE(CLFLUSH_NO_MFENCE, X86_SYNTH(12)) /* No MFENCE needed to serialize CLFLUSH */
 XEN_CPUFEATURE(IND_THUNK_LFENCE,  X86_SYNTH(13)) /* Use IND_THUNK_LFENCE */
 XEN_CPUFEATURE(IND_THUNK_JMP,     X86_SYNTH(14)) /* Use IND_THUNK_JMP */
 XEN_CPUFEATURE(SC_NO_BRANCH_HARDEN, X86_SYNTH(15)) /* (Disable) Conditional branch hardening */
```
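For readers less used to Xen's alternatives machinery, the following is a rough decoding of what the patched flush_area_local() cache path boils down to once alternatives have been applied. It is an illustration only: the helper name and the boolean parameters are invented here, and the real code uses alternative()/alternative_2() patching rather than runtime conditionals.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustration of the post-patching behaviour of the hunk above:
 *  - AMD/Hygon without CLFLUSHOPT:  MFENCE before and after plain CLFLUSH
 *  - any CPU with CLFLUSHOPT:       no leading fence, SFENCE after CLFLUSHOPT
 *  - other CPUs without CLFLUSHOPT: plain CLFLUSH with no extra fencing
 * Before patching has run, the defaults are the two MFENCEs. */
static void flush_cache_lines_sketch(const void *va, size_t sz,
                                     unsigned int line_size,
                                     bool has_clflushopt, bool amd_like)
{
    size_t i;
    bool needs_mfence = amd_like && !has_clflushopt;

    if ( needs_mfence )            /* alternative("mfence", "", CLFLUSH_NO_MFENCE) */
        asm volatile ( "mfence" ::: "memory" );

    for ( i = 0; i < sz; i += line_size )
    {
        if ( has_clflushopt )      /* "data16 clflush" is the CLFLUSHOPT encoding */
            asm volatile ( "data16 clflush %0"
                           :: "m" (((const char *)va)[i]) );
        else
            asm volatile ( "ds; clflush %0"
                           :: "m" (((const char *)va)[i]) );
    }

    if ( has_clflushopt )          /* alternative_2(): SFENCE when CLFLUSHOPT */
        asm volatile ( "sfence" ::: "memory" );
    else if ( needs_mfence )       /* ... otherwise the MFENCE default stays */
        asm volatile ( "mfence" ::: "memory" );
}
```

This also makes visible why the unpatched default matters: until alternatives have run, every caller pays for both MFENCEs, which is the early-Intel slowdown Andrew raises above.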
As noted in the context of 3330013e6739 ("VT-d / x86: re-arrange cache
syncing"): While cache_writeback() has the SFENCE on the correct side of
CLFLUSHOPT, flush_area_local() doesn't. While I can't prove it due to
lacking a copy of the old SDM version, I can only assume this placement
was a result of what had been described there originally. In any event,
recent versions of the SDM have been telling us otherwise.

For AMD (and Hygon, albeit there it's benign, since all their CPUs are
expected to support CLFLUSHOPT) the situation is more complex: MFENCE is
needed ahead of and/or after CLFLUSH when the CPU doesn't also support
CLFLUSHOPT. (It's "and" in our case, as we cannot know what the caller's
needs are.)

Fixes: 623c720fc8da3 ("x86: use CLFLUSHOPT when available")
Signed-off-by: Jan Beulich <jbeulich@suse.com>