Message ID: 20220819171024.1766857-1-dave@stgolabs.net
State: New, archived
Series: [v2] arch/cacheflush: Introduce flush_all_caches()
On Fri, Aug 19, 2022 at 10:10:24AM -0700, Davidlohr Bueso wrote:
> With CXL security features, global CPU cache flushing nvdimm requirements
> are no longer specific to that subsystem, even beyond the scope of
> security_ops. CXL will need such semantics for features not necessarily
> limited to persistent memory.
>
> The functionality this is enabling is to be able to instantaneously
> secure erase potentially terabytes of memory at once and the kernel
> needs to be sure that none of the data from before the secure erase is
> still present in the cache. It is also used when unlocking a memory
> device where speculative reads and firmware accesses could have cached
> poison from before the device was unlocked.
>
> This capability is typically only used once per boot (for unlock), or
> once per bare metal provisioning event (secure erase), like when handing
> off the system to another tenant or decommissioning a device. That small
> scope plus the fact that none of this is available to a VM limits the
> potential damage.
>
> While the scope of this is for physical address space, add a new
> flush_all_caches() in cacheflush headers such that each architecture
> can define it, when capable. For x86 just use the wbinvd hammer and
> prevent any other arch from being capable.
>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> ---
>
> Changes from v1 (https://lore.kernel.org/all/20220815160706.tqd42dv24tgb7x7y@offworld/):
> - Added comments and improved changelog to reflect that this routine
>   should be avoided and not considered a general API (Peter, Dan).
>
>  arch/x86/include/asm/cacheflush.h |  4 +++
>  drivers/acpi/nfit/intel.c         | 41 ++++++++++++++-----------------
>  include/asm-generic/cacheflush.h  | 31 +++++++++++++++++++++++
>  3 files changed, 53 insertions(+), 23 deletions(-)
>
> diff --git a/arch/x86/include/asm/cacheflush.h b/arch/x86/include/asm/cacheflush.h
> index b192d917a6d0..ac4d4fd4e508 100644
> --- a/arch/x86/include/asm/cacheflush.h
> +++ b/arch/x86/include/asm/cacheflush.h
> @@ -10,4 +10,8 @@
>
>  void clflush_cache_range(void *addr, unsigned int size);
>
> +/* see comments in the stub version */
> +#define flush_all_caches() \
> +	do { wbinvd_on_all_cpus(); } while(0)
> +
>  #endif /* _ASM_X86_CACHEFLUSH_H */
> diff --git a/drivers/acpi/nfit/intel.c b/drivers/acpi/nfit/intel.c
> index 8dd792a55730..f2f6c31e6ab7 100644
> --- a/drivers/acpi/nfit/intel.c
> +++ b/drivers/acpi/nfit/intel.c
> @@ -4,6 +4,7 @@
>  #include <linux/ndctl.h>
>  #include <linux/acpi.h>
>  #include <asm/smp.h>
> +#include <linux/cacheflush.h>
>  #include "intel.h"
>  #include "nfit.h"
>
> @@ -190,8 +191,6 @@ static int intel_security_change_key(struct nvdimm *nvdimm,
>  	}
>  }
>
> -static void nvdimm_invalidate_cache(void);
> -
>  static int __maybe_unused intel_security_unlock(struct nvdimm *nvdimm,
>  		const struct nvdimm_key_data *key_data)
>  {
> @@ -210,6 +209,9 @@ static int __maybe_unused intel_security_unlock(struct nvdimm *nvdimm,
>  	};
>  	int rc;
>
> +	if (!flush_all_caches_capable())
> +		return -EINVAL;
> +
>  	if (!test_bit(NVDIMM_INTEL_UNLOCK_UNIT, &nfit_mem->dsm_mask))
>  		return -ENOTTY;
>
> @@ -228,7 +230,7 @@ static int __maybe_unused intel_security_unlock(struct nvdimm *nvdimm,
>  	}
>
>  	/* DIMM unlocked, invalidate all CPU caches before we read it */
> -	nvdimm_invalidate_cache();
> +	flush_all_caches();
>
>  	return 0;
>  }
> @@ -294,11 +296,14 @@ static int __maybe_unused intel_security_erase(struct nvdimm *nvdimm,
>  	},
>  	};
>
> +	if (!flush_all_caches_capable())
> +		return -EINVAL;
> +
>  	if (!test_bit(cmd, &nfit_mem->dsm_mask))
>  		return -ENOTTY;
>
>  	/* flush all cache before we erase DIMM */
> -	nvdimm_invalidate_cache();
> +	flush_all_caches();
>  	memcpy(nd_cmd.cmd.passphrase, key->data,
>  			sizeof(nd_cmd.cmd.passphrase));
>  	rc = nvdimm_ctl(nvdimm, ND_CMD_CALL, &nd_cmd, sizeof(nd_cmd), NULL);
> @@ -318,7 +323,7 @@ static int __maybe_unused intel_security_erase(struct nvdimm *nvdimm,
>  	}
>
>  	/* DIMM erased, invalidate all CPU caches before we read it */
> -	nvdimm_invalidate_cache();
> +	flush_all_caches();
>  	return 0;
>  }
>
> @@ -338,6 +343,9 @@ static int __maybe_unused intel_security_query_overwrite(struct nvdimm *nvdimm)
>  	},
>  	};
>
> +	if (!flush_all_caches_capable())
> +		return -EINVAL;
> +
>  	if (!test_bit(NVDIMM_INTEL_QUERY_OVERWRITE, &nfit_mem->dsm_mask))
>  		return -ENOTTY;
>
> @@ -355,7 +363,7 @@ static int __maybe_unused intel_security_query_overwrite(struct nvdimm *nvdimm)
>  	}
>
>  	/* flush all cache before we make the nvdimms available */
> -	nvdimm_invalidate_cache();
> +	flush_all_caches();
>  	return 0;
>  }
>
> @@ -377,11 +385,14 @@ static int __maybe_unused intel_security_overwrite(struct nvdimm *nvdimm,
>  	},
>  	};
>
> +	if (!flush_all_caches_capable())
> +		return -EINVAL;
> +
>  	if (!test_bit(NVDIMM_INTEL_OVERWRITE, &nfit_mem->dsm_mask))
>  		return -ENOTTY;
>
>  	/* flush all cache before we erase DIMM */
> -	nvdimm_invalidate_cache();
> +	flush_all_caches();
>  	memcpy(nd_cmd.cmd.passphrase, nkey->data,
>  			sizeof(nd_cmd.cmd.passphrase));
>  	rc = nvdimm_ctl(nvdimm, ND_CMD_CALL, &nd_cmd, sizeof(nd_cmd), NULL);
> @@ -401,22 +412,6 @@ static int __maybe_unused intel_security_overwrite(struct nvdimm *nvdimm,
>  	}
>  }
>
> -/*
> - * TODO: define a cross arch wbinvd equivalent when/if
> - * NVDIMM_FAMILY_INTEL command support arrives on another arch.
> - */
> -#ifdef CONFIG_X86
> -static void nvdimm_invalidate_cache(void)
> -{
> -	wbinvd_on_all_cpus();
> -}
> -#else
> -static void nvdimm_invalidate_cache(void)
> -{
> -	WARN_ON_ONCE("cache invalidation required after unlock\n");
> -}
> -#endif
> -
>  static const struct nvdimm_security_ops __intel_security_ops = {
>  	.get_flags = intel_security_flags,
>  	.freeze = intel_security_freeze,
> diff --git a/include/asm-generic/cacheflush.h b/include/asm-generic/cacheflush.h
> index 4f07afacbc23..89f310f92498 100644
> --- a/include/asm-generic/cacheflush.h
> +++ b/include/asm-generic/cacheflush.h
> @@ -115,4 +115,35 @@ static inline void flush_cache_vunmap(unsigned long start, unsigned long end)
>  	memcpy(dst, src, len)
>  #endif
>
> +/*
> + * Flush the entire caches across all CPUs, however:
> + *
> + * YOU DO NOT WANT TO USE THIS FUNCTION.
> + *
> + * It is considered a big hammer and can affect overall
> + * system performance and increase latency/response times.
> + * As such it is not for general usage, but for specific use
> + * cases involving instantaneously invalidating wide swaths
> + * of memory on bare metal.
> +
> + * Unlike the APIs above, this function can be defined on
> + * architectures which have VIPT or PIPT caches, and thus is
> + * beyond the scope of virtual to physical mappings/page
> + * tables changing.
> + *
> + * The limitation here is that the architectures that make
> + * use of it must can actually comply with the semantics,

"must can"?  Did you mean "must"?

> + * such as those which caches are in a consistent state. The
> + * caller can verify the situation early on.
> + */
> +#ifndef flush_all_caches
> +# define flush_all_caches_capable() false
> +static inline void flush_all_caches(void)
> +{
> +	WARN_ON_ONCE("cache invalidation required\n");
> +}

With the addition of flush_all_caches_capable() will flush_all_caches() ever be
called?

Ira

> +#else
> +# define flush_all_caches_capable() true
> +#endif
> +
>  #endif /* _ASM_GENERIC_CACHEFLUSH_H */
> --
> 2.37.2
>
On Fri, 19 Aug 2022, Ira Weiny wrote:

>Did you mean "must"?

Yep.

>> + * such as those which caches are in a consistent state. The
>> + * caller can verify the situation early on.
>> + */
>> +#ifndef flush_all_caches
>> +# define flush_all_caches_capable() false
>> +static inline void flush_all_caches(void)
>> +{
>> +	WARN_ON_ONCE("cache invalidation required\n");
>> +}
>
>With the addition of flush_all_caches_capable() will flush_all_caches() ever be
>called?

No, it should not. Hence you get a splat if you call it bogusly.
On Fri, Aug 19, 2022 at 10:10:24AM -0700, Davidlohr Bueso wrote:
> index b192d917a6d0..ac4d4fd4e508 100644
> --- a/arch/x86/include/asm/cacheflush.h
> +++ b/arch/x86/include/asm/cacheflush.h
> @@ -10,4 +10,8 @@
>
>  void clflush_cache_range(void *addr, unsigned int size);
>
> +/* see comments in the stub version */
> +#define flush_all_caches() \
> +	do { wbinvd_on_all_cpus(); } while(0)

Yikes.  This is just a horrible, horrible name and placement for a bad
hack that should have no generic relevance.

Please fix up the naming to make it clear that this function is for a
very specific nvdimm use case, and move it to a nvdimm-specific header
file.
On Sun, 21 Aug 2022, Christoph Hellwig wrote:

>On Fri, Aug 19, 2022 at 10:10:24AM -0700, Davidlohr Bueso wrote:
>> index b192d917a6d0..ac4d4fd4e508 100644
>> --- a/arch/x86/include/asm/cacheflush.h
>> +++ b/arch/x86/include/asm/cacheflush.h
>> @@ -10,4 +10,8 @@
>>
>>  void clflush_cache_range(void *addr, unsigned int size);
>>
>> +/* see comments in the stub version */
>> +#define flush_all_caches() \
>> +	do { wbinvd_on_all_cpus(); } while(0)
>
>Yikes. This is just a horrible, horrible name and placement for a bad
>hack that should have no generic relevance.

Why does this have no generic relevance? There's already been discussions
on how much wbinv is hated[0].

>Please fix up the naming to make it clear that this function is for a
>very specific nvdimm use case, and move it to a nvdimm-specific header
>file.

Do you have any suggestions for a name? And, as the changelog describes,
this is not nvdimm specific anymore, and the whole point of all this is
volatile memory components for cxl, hence nvdimm namespace is bogus.

[0] https://lore.kernel.org/all/Yvtc2u1J%2Fqip8za9@worktop.programming.kicks-ass.net/

Thanks,
Davidlohr
Davidlohr Bueso wrote:
> On Sun, 21 Aug 2022, Christoph Hellwig wrote:
>
> >On Fri, Aug 19, 2022 at 10:10:24AM -0700, Davidlohr Bueso wrote:
> >> index b192d917a6d0..ac4d4fd4e508 100644
> >> --- a/arch/x86/include/asm/cacheflush.h
> >> +++ b/arch/x86/include/asm/cacheflush.h
> >> @@ -10,4 +10,8 @@
> >>
> >>  void clflush_cache_range(void *addr, unsigned int size);
> >>
> >> +/* see comments in the stub version */
> >> +#define flush_all_caches() \
> >> +	do { wbinvd_on_all_cpus(); } while(0)
> >
> >Yikes. This is just a horrible, horrible name and placement for a bad
> >hack that should have no generic relevance.
>
> Why does this have no generic relevance? There's already been discussions
> on how much wbinv is hated[0].
>
> >Please fix up the naming to make it clear that this function is for a
> >very specific nvdimm use case, and move it to a nvdimm-specific header
> >file.
>
> Do you have any suggestions for a name? And, as the changelog describes,
> this is not nvdimm specific anymore, and the whole point of all this is
> volatile memory components for cxl, hence nvdimm namespace is bogus.
>
> [0] https://lore.kernel.org/all/Yvtc2u1J%2Fqip8za9@worktop.programming.kicks-ass.net/

While it is not nvdimm specific anymore, it's still specific to "memory
devices that can bulk invalidate a physical address space". I.e. it's
not as generic as its location in arch/x86/include/asm/cacheflush.h
would imply. So, similar to arch_invalidate_pmem(), lets keep it in a
device-driver-specific header file, because hch and peterz are right, we
need to make this much more clear that it is not for general
consumption.

There is already include/linux/memregion.h for identifying memory
regions with a common id across ACPI NFIT, DAX "soft reserved" regions,
and CXL.
So how about something like along the same lines as arch_invalidate_pmem():

diff --git a/include/linux/memregion.h b/include/linux/memregion.h
index c04c4fd2e209..0310135d7a42 100644
--- a/include/linux/memregion.h
+++ b/include/linux/memregion.h
@@ -21,3 +21,23 @@ static inline void memregion_free(int id)
 }
 #endif
 #endif /* _MEMREGION_H_ */
+
+/*
+ * Device memory technologies like NVDIMM and CXL have events like
+ * secure erase and dynamic region provision that can invalidate an
+ * entire physical memory address range at once. Limit that
+ * functionality to architectures that have an efficient way to
+ * writeback and invalidate potentially terabytes of memory at once.
+ */
+#ifdef CONFIG_ARCH_HAS_MEMREGION_INVALIDATE
+void arch_flush_memregion(phys_addr_t phys, resource_size_t size);
+bool arch_has_flush_memregion(void);
+#else
+static inline bool arch_has_flush_memregion(void)
+{
+	return false;
+}
+static void arch_flush_memregion(phys_addr_t phys, resource_size_t size);
+{
+}
+#endif

...where arch_has_flush_memregion() on x86 is:

bool arch_has_flush_memregion(void)
{
	return !cpu_feature_enabled(X86_FEATURE_HYPERVISOR)
}

...to make it clear that this API is unavailable in virtual
environments.
On Mon, 22 Aug 2022, Dan Williams wrote:

>Davidlohr Bueso wrote:
>> On Sun, 21 Aug 2022, Christoph Hellwig wrote:
>>
>> >On Fri, Aug 19, 2022 at 10:10:24AM -0700, Davidlohr Bueso wrote:
>> >> index b192d917a6d0..ac4d4fd4e508 100644
>> >> --- a/arch/x86/include/asm/cacheflush.h
>> >> +++ b/arch/x86/include/asm/cacheflush.h
>> >> @@ -10,4 +10,8 @@
>> >>
>> >>  void clflush_cache_range(void *addr, unsigned int size);
>> >>
>> >> +/* see comments in the stub version */
>> >> +#define flush_all_caches() \
>> >> +	do { wbinvd_on_all_cpus(); } while(0)
>> >
>> >Yikes. This is just a horrible, horrible name and placement for a bad
>> >hack that should have no generic relevance.
>>
>> Why does this have no generic relevance? There's already been discussions
>> on how much wbinv is hated[0].
>>
>> >Please fix up the naming to make it clear that this function is for a
>> >very specific nvdimm use case, and move it to a nvdimm-specific header
>> >file.
>>
>> Do you have any suggestions for a name? And, as the changelog describes,
>> this is not nvdimm specific anymore, and the whole point of all this is
>> volatile memory components for cxl, hence nvdimm namespace is bogus.
>>
>> [0] https://lore.kernel.org/all/Yvtc2u1J%2Fqip8za9@worktop.programming.kicks-ass.net/
>
>While it is not nvdimm specific anymore, it's still specific to "memory
>devices that can bulk invalidate a physical address space". I.e. it's
>not as generic as its location in arch/x86/include/asm/cacheflush.h
>would imply. So, similar to arch_invalidate_pmem(), lets keep it in a
>device-driver-specific header file, because hch and peterz are right, we
>need to make this much more clear that it is not for general
>consumption.

Fine, I won't argue - although I don't particularly agree, at least wrt
the naming. Imo my naming does _exactly_ what it should do and is much
easier to read than arch_has_flush_memregion(), which is counter-intuitive
when we are in fact flushing everything. Nor does this make it any
clearer about virtual vs physical mappings (except that it's no longer
associated with cacheflush). But, excepting arm's cacheflush.h, rare
archs with braino caches get way too much credit in their namespace
usage.

But yes, there is no doubt that my version is more inviting than it
should be, which made me think of naming it flush_all_caches_careful()
so the user is forced to at least check the function (or one would hope).

Anyway, I'll send a new version based on the below - I particularly agree
with the hypervisor bits.

Thanks,
Davidlohr
Davidlohr Bueso wrote:
> On Mon, 22 Aug 2022, Dan Williams wrote:
>
> >Davidlohr Bueso wrote:
> >> On Sun, 21 Aug 2022, Christoph Hellwig wrote:
> >>
> >> >On Fri, Aug 19, 2022 at 10:10:24AM -0700, Davidlohr Bueso wrote:
> >> >> index b192d917a6d0..ac4d4fd4e508 100644
> >> >> --- a/arch/x86/include/asm/cacheflush.h
> >> >> +++ b/arch/x86/include/asm/cacheflush.h
> >> >> @@ -10,4 +10,8 @@
> >> >>
> >> >>  void clflush_cache_range(void *addr, unsigned int size);
> >> >>
> >> >> +/* see comments in the stub version */
> >> >> +#define flush_all_caches() \
> >> >> +	do { wbinvd_on_all_cpus(); } while(0)
> >> >
> >> >Yikes. This is just a horrible, horrible name and placement for a bad
> >> >hack that should have no generic relevance.
> >>
> >> Why does this have no generic relevance? There's already been discussions
> >> on how much wbinv is hated[0].
> >>
> >> >Please fix up the naming to make it clear that this function is for a
> >> >very specific nvdimm use case, and move it to a nvdimm-specific header
> >> >file.
> >>
> >> Do you have any suggestions for a name? And, as the changelog describes,
> >> this is not nvdimm specific anymore, and the whole point of all this is
> >> volatile memory components for cxl, hence nvdimm namespace is bogus.
> >>
> >> [0] https://lore.kernel.org/all/Yvtc2u1J%2Fqip8za9@worktop.programming.kicks-ass.net/
> >
> >While it is not nvdimm specific anymore, it's still specific to "memory
> >devices that can bulk invalidate a physical address space". I.e. it's
> >not as generic as its location in arch/x86/include/asm/cacheflush.h
> >would imply. So, similar to arch_invalidate_pmem(), lets keep it in a
> >device-driver-specific header file, because hch and peterz are right, we
> >need to make this much more clear that it is not for general
> >consumption.
>
> Fine, I won't argue - although I don't particularly agree, at least wrt
> the naming. Imo my naming does _exactly_ what it should do and is much
> easier to read than arch_has_flush_memregion() which is counter intuitive
> when we are in fact flushing everything. This does not make it any
> clearer about virt vs physical mappings either (except that it's no
> longer associated to cacheflush). But, excepting arm cacheflush.h's
> rare arch with braino cache users get way too much credit in their
> namespace usage.
>
> But yes there is no doubt that my version is more inviting than it should be,
> which made me think of naming it to flush_all_caches_careful() so the user
> is forced to at least check the function (or one would hope).

So I'm not married to arch_has_flush_memregion() or even including the
physical address range to flush, the only aspect of the prototype I want
to see incorporated is something about the target / motivation for the
flush. "flush_all_caches_careful()" says nothing about what the API is
being "careful" about. It reminds of Linus' comments on memcpy_mcsafe():

https://lore.kernel.org/all/CAHk-=wh1SPyuGkTkQESsacwKTpjWd=_-KwoCK5o=SuC3yMdf7A@mail.gmail.com/

"Naming - like comments - shouldn't be about what some implementation
is, but about the concept."

So "memregion" was meant to represent a memory device backed physical
address range, but that association may only be in my own head. How
about something even more explicit like: "flush_after_memdev_invalidate()"
where someone would feel icky using it for anything other than what we
have been talking about in this thread.

> Anyway, I'll send a new version based on the below - I particularly agree
> with the hypervisor bits.

Ok, just one more lap around the bikeshed track, but I think we're
converging.
diff --git a/arch/x86/include/asm/cacheflush.h b/arch/x86/include/asm/cacheflush.h
index b192d917a6d0..ac4d4fd4e508 100644
--- a/arch/x86/include/asm/cacheflush.h
+++ b/arch/x86/include/asm/cacheflush.h
@@ -10,4 +10,8 @@
 
 void clflush_cache_range(void *addr, unsigned int size);
 
+/* see comments in the stub version */
+#define flush_all_caches() \
+	do { wbinvd_on_all_cpus(); } while(0)
+
 #endif /* _ASM_X86_CACHEFLUSH_H */
diff --git a/drivers/acpi/nfit/intel.c b/drivers/acpi/nfit/intel.c
index 8dd792a55730..f2f6c31e6ab7 100644
--- a/drivers/acpi/nfit/intel.c
+++ b/drivers/acpi/nfit/intel.c
@@ -4,6 +4,7 @@
 #include <linux/ndctl.h>
 #include <linux/acpi.h>
 #include <asm/smp.h>
+#include <linux/cacheflush.h>
 #include "intel.h"
 #include "nfit.h"
 
@@ -190,8 +191,6 @@ static int intel_security_change_key(struct nvdimm *nvdimm,
 	}
 }
 
-static void nvdimm_invalidate_cache(void);
-
 static int __maybe_unused intel_security_unlock(struct nvdimm *nvdimm,
 		const struct nvdimm_key_data *key_data)
 {
@@ -210,6 +209,9 @@ static int __maybe_unused intel_security_unlock(struct nvdimm *nvdimm,
 	};
 	int rc;
 
+	if (!flush_all_caches_capable())
+		return -EINVAL;
+
 	if (!test_bit(NVDIMM_INTEL_UNLOCK_UNIT, &nfit_mem->dsm_mask))
 		return -ENOTTY;
 
@@ -228,7 +230,7 @@ static int __maybe_unused intel_security_unlock(struct nvdimm *nvdimm,
 	}
 
 	/* DIMM unlocked, invalidate all CPU caches before we read it */
-	nvdimm_invalidate_cache();
+	flush_all_caches();
 
 	return 0;
 }
@@ -294,11 +296,14 @@ static int __maybe_unused intel_security_erase(struct nvdimm *nvdimm,
 	},
 	};
 
+	if (!flush_all_caches_capable())
+		return -EINVAL;
+
 	if (!test_bit(cmd, &nfit_mem->dsm_mask))
 		return -ENOTTY;
 
 	/* flush all cache before we erase DIMM */
-	nvdimm_invalidate_cache();
+	flush_all_caches();
 	memcpy(nd_cmd.cmd.passphrase, key->data,
 			sizeof(nd_cmd.cmd.passphrase));
 	rc = nvdimm_ctl(nvdimm, ND_CMD_CALL, &nd_cmd, sizeof(nd_cmd), NULL);
@@ -318,7 +323,7 @@ static int __maybe_unused intel_security_erase(struct nvdimm *nvdimm,
 	}
 
 	/* DIMM erased, invalidate all CPU caches before we read it */
-	nvdimm_invalidate_cache();
+	flush_all_caches();
 	return 0;
 }
 
@@ -338,6 +343,9 @@ static int __maybe_unused intel_security_query_overwrite(struct nvdimm *nvdimm)
 	},
 	};
 
+	if (!flush_all_caches_capable())
+		return -EINVAL;
+
 	if (!test_bit(NVDIMM_INTEL_QUERY_OVERWRITE, &nfit_mem->dsm_mask))
 		return -ENOTTY;
 
@@ -355,7 +363,7 @@ static int __maybe_unused intel_security_query_overwrite(struct nvdimm *nvdimm)
 	}
 
 	/* flush all cache before we make the nvdimms available */
-	nvdimm_invalidate_cache();
+	flush_all_caches();
 	return 0;
 }
 
@@ -377,11 +385,14 @@ static int __maybe_unused intel_security_overwrite(struct nvdimm *nvdimm,
 	},
 	};
 
+	if (!flush_all_caches_capable())
+		return -EINVAL;
+
 	if (!test_bit(NVDIMM_INTEL_OVERWRITE, &nfit_mem->dsm_mask))
 		return -ENOTTY;
 
 	/* flush all cache before we erase DIMM */
-	nvdimm_invalidate_cache();
+	flush_all_caches();
 	memcpy(nd_cmd.cmd.passphrase, nkey->data,
 			sizeof(nd_cmd.cmd.passphrase));
 	rc = nvdimm_ctl(nvdimm, ND_CMD_CALL, &nd_cmd, sizeof(nd_cmd), NULL);
@@ -401,22 +412,6 @@ static int __maybe_unused intel_security_overwrite(struct nvdimm *nvdimm,
 	}
 }
 
-/*
- * TODO: define a cross arch wbinvd equivalent when/if
- * NVDIMM_FAMILY_INTEL command support arrives on another arch.
- */
-#ifdef CONFIG_X86
-static void nvdimm_invalidate_cache(void)
-{
-	wbinvd_on_all_cpus();
-}
-#else
-static void nvdimm_invalidate_cache(void)
-{
-	WARN_ON_ONCE("cache invalidation required after unlock\n");
-}
-#endif
-
 static const struct nvdimm_security_ops __intel_security_ops = {
 	.get_flags = intel_security_flags,
 	.freeze = intel_security_freeze,
diff --git a/include/asm-generic/cacheflush.h b/include/asm-generic/cacheflush.h
index 4f07afacbc23..89f310f92498 100644
--- a/include/asm-generic/cacheflush.h
+++ b/include/asm-generic/cacheflush.h
@@ -115,4 +115,35 @@ static inline void flush_cache_vunmap(unsigned long start, unsigned long end)
 	memcpy(dst, src, len)
 #endif
 
+/*
+ * Flush the entire caches across all CPUs, however:
+ *
+ * YOU DO NOT WANT TO USE THIS FUNCTION.
+ *
+ * It is considered a big hammer and can affect overall
+ * system performance and increase latency/response times.
+ * As such it is not for general usage, but for specific use
+ * cases involving instantaneously invalidating wide swaths
+ * of memory on bare metal.
+
+ * Unlike the APIs above, this function can be defined on
+ * architectures which have VIPT or PIPT caches, and thus is
+ * beyond the scope of virtual to physical mappings/page
+ * tables changing.
+ *
+ * The limitation here is that the architectures that make
+ * use of it must can actually comply with the semantics,
+ * such as those which caches are in a consistent state. The
+ * caller can verify the situation early on.
+ */
+#ifndef flush_all_caches
+# define flush_all_caches_capable() false
+static inline void flush_all_caches(void)
+{
+	WARN_ON_ONCE("cache invalidation required\n");
+}
+#else
+# define flush_all_caches_capable() true
+#endif
+
 #endif /* _ASM_GENERIC_CACHEFLUSH_H */
With CXL security features, global CPU cache flushing nvdimm requirements
are no longer specific to that subsystem, even beyond the scope of
security_ops. CXL will need such semantics for features not necessarily
limited to persistent memory.

The functionality this is enabling is to be able to instantaneously
secure erase potentially terabytes of memory at once and the kernel
needs to be sure that none of the data from before the secure erase is
still present in the cache. It is also used when unlocking a memory
device where speculative reads and firmware accesses could have cached
poison from before the device was unlocked.

This capability is typically only used once per boot (for unlock), or
once per bare metal provisioning event (secure erase), like when handing
off the system to another tenant or decommissioning a device. That small
scope plus the fact that none of this is available to a VM limits the
potential damage.

While the scope of this is for physical address space, add a new
flush_all_caches() in cacheflush headers such that each architecture
can define it, when capable. For x86 just use the wbinvd hammer and
prevent any other arch from being capable.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
---

Changes from v1 (https://lore.kernel.org/all/20220815160706.tqd42dv24tgb7x7y@offworld/):
- Added comments and improved changelog to reflect that this routine
  should be avoided and not considered a general API (Peter, Dan).

 arch/x86/include/asm/cacheflush.h |  4 +++
 drivers/acpi/nfit/intel.c         | 41 ++++++++++++++-----------------
 include/asm-generic/cacheflush.h  | 31 +++++++++++++++++++++++
 3 files changed, 53 insertions(+), 23 deletions(-)