Message ID | 20221209044557.1496580-3-robert.hu@linux.intel.com (mailing list archive)
---|---
State | New, archived
Series | Linear Address Masking (LAM) KVM Enabling
On Fri, Dec 09, 2022, Robert Hoo wrote:
> If LAM enabled, CR4.LAM_SUP is owned by guest; otherwise, reserved.
Why is it passed through to the guest?
On Sat, 2023-01-07 at 00:38 +0000, Sean Christopherson wrote:
> On Fri, Dec 09, 2022, Robert Hoo wrote:
> > If LAM enabled, CR4.LAM_SUP is owned by guest; otherwise, reserved.
>
> Why is it passed through to the guest?

I think there's no need to intercept the guest's control over CR4.LAM_SUP,
which controls whether LAM applies to supervisor-mode addresses. The same
pass-through happens for CR3.LAM_U{48,57} when EPT is in effect.
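For context, "owned by the guest" means KVM clears the bit in the VMCS CR4
guest/host mask, so guest writes to that bit no longer cause a VM-exit. A
simplified sketch of how KVM computes the mask, modeled on
set_cr4_guest_host_mask() in arch/x86/kvm/vmx/vmx.c (details vary by kernel
version; the nested-guest handling is omitted here):

	/* Bits NOT set in CR4_GUEST_HOST_MASK are guest-owned. */
	void set_cr4_guest_host_mask(struct vcpu_vmx *vmx)
	{
		struct kvm_vcpu *vcpu = &vmx->vcpu;

		/*
		 * Start from the candidate pass-through set, minus any
		 * bits the guest's CPUID marks as reserved.
		 */
		vcpu->arch.cr4_guest_owned_bits = KVM_POSSIBLE_CR4_GUEST_BITS &
						  ~vcpu->arch.cr4_guest_rsvd_bits;
		if (!enable_ept)
			vcpu->arch.cr4_guest_owned_bits &= ~X86_CR4_PGE;

		vmcs_writel(CR4_GUEST_HOST_MASK, ~vcpu->arch.cr4_guest_owned_bits);
	}

Adding X86_CR4_LAM_SUP to KVM_POSSIBLE_CR4_GUEST_BITS, as the patch below
does, is what makes the bit eligible for pass-through here.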
On Sat, Jan 07, 2023, Robert Hoo wrote:
> On Sat, 2023-01-07 at 00:38 +0000, Sean Christopherson wrote:
> > On Fri, Dec 09, 2022, Robert Hoo wrote:
> > > If LAM enabled, CR4.LAM_SUP is owned by guest; otherwise, reserved.
> >
> > Why is it passed through to the guest?
>
> I think there's no need to intercept the guest's control over
> CR4.LAM_SUP, which controls whether LAM applies to supervisor-mode
> addresses.

That's not a sufficient justification. KVM doesn't strictly need to
intercept most CR4 bits, but not intercepting has performance implications,
e.g. KVM needs to do a VMREAD(GUEST_CR4) to get LAM_SUP if the bit is passed
through. As a base rule, KVM intercepts CR4 bits unless there's a reason not
to, e.g. if the CR4 bit in question is written frequently by real guests
and/or never consumed by KVM.
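Sean's VMREAD point: when a CR4 bit is guest-owned, KVM cannot trust its
cached copy of CR4 and must re-read GUEST_CR4 from the VMCS before consuming
the bit. Roughly how that looks in arch/x86/kvm/kvm_cache_regs.h (a
simplified sketch; the exact code differs across kernel versions):

	static __always_inline ulong kvm_read_cr4_bits(struct kvm_vcpu *vcpu,
						       ulong mask)
	{
		ulong tmask = mask & KVM_POSSIBLE_CR4_GUEST_BITS;

		/*
		 * If any requested bit may be guest-owned and the cached
		 * value is stale, refresh the cache; on VMX that refresh
		 * is a VMREAD of GUEST_CR4.
		 */
		if ((tmask & vcpu->arch.cr4_guest_owned_bits) &&
		    !kvm_register_is_available(vcpu, VCPU_EXREG_CR4))
			static_call(kvm_x86_cache_reg)(vcpu, VCPU_EXREG_CR4);

		return vcpu->arch.cr4 & mask;
	}

So a pass-through LAM_SUP turns every read of CR4.LAM_SUP in KVM into a
potential VMREAD, whereas intercepting keeps the cache always valid at the
cost of a VM-exit on each guest write.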
On Mon, 2023-01-09 at 16:29 +0000, Sean Christopherson wrote:
> On Sat, Jan 07, 2023, Robert Hoo wrote:
> > On Sat, 2023-01-07 at 00:38 +0000, Sean Christopherson wrote:
> > > On Fri, Dec 09, 2022, Robert Hoo wrote:
> > > > If LAM enabled, CR4.LAM_SUP is owned by guest; otherwise, reserved.
> > >
> > > Why is it passed through to the guest?
> >
> > I think there's no need to intercept the guest's control over
> > CR4.LAM_SUP, which controls whether LAM applies to supervisor-mode
> > addresses.
>
> That's not a sufficient justification. KVM doesn't strictly need to
> intercept most CR4 bits,

Yeah, that also matches my experience.

> but not intercepting has performance implications, e.g. KVM needs to do
> a VMREAD(GUEST_CR4) to get LAM_SUP if the bit is passed through.

Right. On the other hand, intercepting incurs the performance penalty of a
VM-exit, which is much heavier. So it's a trade-off, right?

> As a base rule, KVM intercepts CR4 bits unless there's a reason not to,
> e.g. if the CR4 bit in question is written frequently by real guests
> and/or never consumed by KVM.

Judging by those two criteria:

Is CR4.LAM_SUP written frequently by the guest? I'm not sure: the native
kernel enabling patches cover LAM_U57 only so far, so it isn't clear whether
the supervisor-mode control will be per-process/per-thread or kernel-wide.
If its use case is KASAN-like, would you expect frequent guest writes?

Is it never consumed by KVM? No, e.g. kvm_untagged_addr() reads this bit.
But not frequently, I think, at least in this patch set.

So in general, what's your suggestion/preference? I'm fine with either
choice.
On Tue, Jan 10, 2023, Robert Hoo wrote:
> On Mon, 2023-01-09 at 16:29 +0000, Sean Christopherson wrote:
> > As a base rule, KVM intercepts CR4 bits unless there's a reason not
> > to, e.g. if the CR4 bit in question is written frequently by real
> > guests and/or never consumed by KVM.
>
> Is CR4.LAM_SUP written frequently by the guest? I'm not sure: the
> native kernel enabling patches cover LAM_U57 only so far, so it isn't
> clear whether the supervisor-mode control will be per-process/per-thread
> or kernel-wide. If its use case is KASAN-like, would you expect frequent
> guest writes?

Controlling a kernel-level knob on a per-process basis would be bizarre.
But the expected use case definitely needs to be understood. I assume
Kirill, or whoever is doing the LAM_SUP implementation, can provide answers.

> Is it never consumed by KVM? No, e.g. kvm_untagged_addr() reads this
> bit. But not frequently, I think, at least in this patch set.

Untagging an address will need to be done any time KVM consumes a guest
virtual address, i.e. performs any kind of emulation. That's not super high
frequency on modern CPUs, but it's not exactly rare either.

> So in general, what's your suggestion/preference? I'm fine with either
> choice.

Unless guests will be touching CR4.LAM_SUP on context switches, intercepting
is unquestionably the right choice.
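For reference, the untagging Sean refers to is cheap once CR4.LAM_SUP is
known: it is essentially a sign-extension over the metadata bits. A
hypothetical sketch of the supervisor-address case (the helper name and the
kvm_is_cr4_bit_set() accessor are illustrative; this patch set's actual
helper is kvm_untagged_addr()):

	/*
	 * Strip LAM metadata from a supervisor-mode address.  With
	 * CR4.LA57 set, LAM_SUP masks bits 62:57 (the significant sign
	 * bit is 56); otherwise it masks bits 62:48 (sign bit 47).
	 * Bit 63 is preserved either way.
	 */
	static gva_t untag_supervisor_addr(struct kvm_vcpu *vcpu, gva_t gva)
	{
		int lam_bit;

		if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_LAM_SUP))
			return gva;

		lam_bit = kvm_is_cr4_bit_set(vcpu, X86_CR4_LA57) ? 56 : 47;
		return (sign_extend64(gva, lam_bit) & ~BIT_ULL(63)) |
		       (gva & BIT_ULL(63));
	}

The CR4 reads here go through the register cache, which is exactly where the
intercept-vs-pass-through decision above shows up: with interception, the
cached value is always valid and no VMREAD is needed.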
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3c736e00b6b1..275a6b2337b1 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -120,7 +120,8 @@
 			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
 			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
 			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
-			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
+			  | X86_CR4_LAM_SUP))
 
 #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
 
diff --git a/arch/x86/kvm/kvm_cache_regs.h b/arch/x86/kvm/kvm_cache_regs.h
index 3febc342360c..917f1b770839 100644
--- a/arch/x86/kvm/kvm_cache_regs.h
+++ b/arch/x86/kvm/kvm_cache_regs.h
@@ -7,7 +7,8 @@
 #define KVM_POSSIBLE_CR0_GUEST_BITS X86_CR0_TS
 #define KVM_POSSIBLE_CR4_GUEST_BITS				  \
 	(X86_CR4_PVI | X86_CR4_DE | X86_CR4_PCE | X86_CR4_OSFXSR \
-	 | X86_CR4_OSXMMEXCPT | X86_CR4_PGE | X86_CR4_TSD | X86_CR4_FSGSBASE)
+	 | X86_CR4_OSXMMEXCPT | X86_CR4_PGE | X86_CR4_TSD | X86_CR4_FSGSBASE \
+	 | X86_CR4_LAM_SUP)
 
 #define X86_CR0_PDPTR_BITS    (X86_CR0_CD | X86_CR0_NW | X86_CR0_PG)
 #define X86_CR4_TLBFLUSH_BITS (X86_CR4_PGE | X86_CR4_PCIDE | X86_CR4_PAE | X86_CR4_SMEP)
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index d92e580768e5..6c1fbe27616f 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -474,6 +474,8 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type);
 		__reserved_bits |= X86_CR4_VMXE;        \
 	if (!__cpu_has(__c, X86_FEATURE_PCID))          \
 		__reserved_bits |= X86_CR4_PCIDE;       \
+	if (!__cpu_has(__c, X86_FEATURE_LAM))           \
+		__reserved_bits |= X86_CR4_LAM_SUP;     \
 	__reserved_bits; \
 })
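For completeness, the __cr4_reserved_bits() hunk above is what makes KVM
reject guest attempts to set CR4.LAM_SUP when the LAM CPUID bit is not
exposed. Roughly how the macro is consumed (a simplified sketch based on
arch/x86/kvm/x86.c and cpuid.c; the exact code varies by kernel version):

	/* Recomputed whenever userspace updates the guest's CPUID. */
	vcpu->arch.cr4_guest_rsvd_bits =
		__cr4_reserved_bits(guest_cpuid_has, vcpu);

	/* Any CR4 write that sets a reserved bit is rejected. */
	static bool kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
	{
		if (cr4 & cr4_reserved_bits)			/* host-reserved */
			return false;
		if (cr4 & vcpu->arch.cr4_guest_rsvd_bits)	/* per guest CPUID */
			return false;
		return true;
	}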