Message ID | 20230609220104.1836988-1-oliver.upton@linux.dev (mailing list archive) |
---|---|
Series | KVM: arm64: Work around Ampere1 erratum AC03_CPU_38 |
On Fri, Jun 09, 2023 at 10:01:01PM +0000, Oliver Upton wrote:
> Small series to work around a CPU erratum on AmpereOne. While the
> implementation does not advertise support for FEAT_HAFDBS (due to
> another erratum), the associated control bits do not have RES0 behavior
> as required by the architecture.
>
> Usage of HAFDBS at stage-1 is unaffected, since HA and HD are only
> enabled on implementations that advertise the feature. However, KVM
> relies on HA having RES0 semantics if the feature isn't implemented. The
> end result is that KVM enables a broken hardware access flag
> implementation that could lead to correctness issues.

Just curious, what's the correctness issue here? The access flag is
mostly indicative of which pages are old for swapping out/discarding.
It's not like the dirty state which would be dangerous if we get wrong.

Hey Catalin,

On Wed, Jun 14, 2023 at 05:57:55PM +0100, Catalin Marinas wrote:
> On Fri, Jun 09, 2023 at 10:01:01PM +0000, Oliver Upton wrote:
> > Small series to work around a CPU erratum on AmpereOne. While the
> > implementation does not advertise support for FEAT_HAFDBS (due to
> > another erratum), the associated control bits do not have RES0 behavior
> > as required by the architecture.
> >
> > Usage of HAFDBS at stage-1 is unaffected, since HA and HD are only
> > enabled on implementations that advertise the feature. However, KVM
> > relies on HA having RES0 semantics if the feature isn't implemented. The
> > end result is that KVM enables a broken hardware access flag
> > implementation that could lead to correctness issues.
>
> Just curious, what's the correctness issue here? The access flag is
> mostly indicative of which pages are old for swapping out/discarding.
> It's not like the dirty state which would be dangerous if we get wrong.

I probably could have helped out by giving the full context.

The software-observable behavior on this system is that the A or D
updates could arrive after a PTE has been marked as invalid, which could
corrupt software metadata stuffed into the page tables. We do exactly
that at stage-2 in KVM for parallel fault handling, where a magic value
indicates a PTE is being updated by another thread.

On Wed, Jun 14, 2023 at 11:06:40PM +0000, Oliver Upton wrote:
> Hey Catalin,
>
> On Wed, Jun 14, 2023 at 05:57:55PM +0100, Catalin Marinas wrote:
> > On Fri, Jun 09, 2023 at 10:01:01PM +0000, Oliver Upton wrote:
> > > Small series to work around a CPU erratum on AmpereOne. While the
> > > implementation does not advertise support for FEAT_HAFDBS (due to
> > > another erratum), the associated control bits do not have RES0 behavior
> > > as required by the architecture.
> > >
> > > Usage of HAFDBS at stage-1 is unaffected, since HA and HD are only
> > > enabled on implementations that advertise the feature. However, KVM
> > > relies on HA having RES0 semantics if the feature isn't implemented. The
> > > end result is that KVM enables a broken hardware access flag
> > > implementation that could lead to correctness issues.
> >
> > Just curious, what's the correctness issue here? The access flag is
> > mostly indicative of which pages are old for swapping out/discarding.
> > It's not like the dirty state which would be dangerous if we get wrong.
>
> I probably could have helped out by giving the full context.
>
> The software-observable behavior on this system is that the A or D
> updates could arrive after a PTE has been marked as invalid, which could
> corrupt software metadata stuffed into the page tables. We do exactly
> that at stage-2 in KVM for parallel fault handling, where a magic value
> indicates a PTE is being updated by another thread.

Ah, ok, that's dangerous indeed. Thanks for the details (you may want to
add them in the patch description or the erratum kconfig entry).

On Fri, 09 Jun 2023 23:01:01 +0100, Oliver Upton <oliver.upton@linux.dev> wrote:
>
> Hi folks,
>
> Small series to work around a CPU erratum on AmpereOne. While the
> implementation does not advertise support for FEAT_HAFDBS (due to
> another erratum), the associated control bits do not have RES0 behavior
> as required by the architecture.
>
> Usage of HAFDBS at stage-1 is unaffected, since HA and HD are only
> enabled on implementations that advertise the feature. However, KVM
> relies on HA having RES0 semantics if the feature isn't implemented. The
> end result is that KVM enables a broken hardware access flag
> implementation that could lead to correctness issues.
>
> Applies to 6.4-rc1. Tested with access_tracking_perf_test, verifying
> that KVM is indeed taking Access Flag faults.

For the series:

Reviewed-by: Marc Zyngier <maz@kernel.org>

	M.

On Fri, 9 Jun 2023 22:01:01 +0000, Oliver Upton wrote:
> Small series to work around a CPU erratum on AmpereOne. While the
> implementation does not advertise support for FEAT_HAFDBS (due to
> another erratum), the associated control bits do not have RES0 behavior
> as required by the architecture.
>
> Usage of HAFDBS at stage-1 is unaffected, since HA and HD are only
> enabled on implementations that advertise the feature. However, KVM
> relies on HA having RES0 semantics if the feature isn't implemented. The
> end result is that KVM enables a broken hardware access flag
> implementation that could lead to correctness issues.
>
> [...]

Applied w/ an expanded description of what's wrong with the unadvertised
HAFDBS implementation, per Catalin's suggestion.

[1/3] arm64: errata: Mitigate Ampere1 erratum AC03_CPU_38 at stage-2
      https://git.kernel.org/kvmarm/kvmarm/c/6df696cd9bc1
[2/3] KVM: arm64: Refactor HFGxTR configuration into separate helpers
      https://git.kernel.org/kvmarm/kvmarm/c/ce4a36225753
[3/3] KVM: arm64: Prevent guests from enabling HA/HD on Ampere1
      https://git.kernel.org/kvmarm/kvmarm/c/082fdfd13841

--
Best,
Oliver