
[v4,7/9] KVM: x86: When guest set CR3, handle LAM bits semantics

Message ID 20230209024022.3371768-8-robert.hu@linux.intel.com (mailing list archive)
State New, archived
Series Linear Address Masking (LAM) KVM Enabling

Commit Message

Robert Hoo Feb. 9, 2023, 2:40 a.m. UTC
When the guest changes only the LAM bits, ask the next vcpu run to load the
mmu pgd, so that it builds a new CR3 with the updated LAM bits.
When CR3's effective address bits change, no matter whether the LAM bits
change or not, go through the normal pgd update process.

Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
---
 arch/x86/kvm/x86.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)
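
For reference, the two CR3 LAM control bits discussed throughout this
thread are CR3[61] and CR3[62]. Paraphrasing the definitions from the
host-side LAM enabling work (exact comment text may differ):

	#define X86_CR3_LAM_U57_BIT	61 /* LAM for user pointers, bits 62:57 masked */
	#define X86_CR3_LAM_U57		_BITULL(X86_CR3_LAM_U57_BIT)
	#define X86_CR3_LAM_U48_BIT	62 /* LAM for user pointers, bits 62:48 masked */
	#define X86_CR3_LAM_U48		_BITULL(X86_CR3_LAM_U48_BIT)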

Comments

Chao Gao Feb. 13, 2023, 3:31 a.m. UTC | #1
On Thu, Feb 09, 2023 at 10:40:20AM +0800, Robert Hoo wrote:
>When the guest changes only the LAM bits, ask the next vcpu run to load the
>mmu pgd, so that it builds a new CR3 with the updated LAM bits.
>When CR3's effective address bits change, no matter whether the LAM bits
>change or not, go through the normal pgd update process.

Please squash this into patch 2.

>
>Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
>Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
>---
> arch/x86/kvm/x86.c | 24 ++++++++++++++++++++----
> 1 file changed, 20 insertions(+), 4 deletions(-)
>
>diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>index 3218f465ae71..7df6c9dd12a5 100644
>--- a/arch/x86/kvm/x86.c
>+++ b/arch/x86/kvm/x86.c
>@@ -1242,9 +1242,9 @@ static bool kvm_is_valid_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
> int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
> {
> 	bool skip_tlb_flush = false;
>-	unsigned long pcid = 0;
>+	unsigned long pcid = 0, old_cr3;
> #ifdef CONFIG_X86_64

>-	bool pcid_enabled = kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE);
>+	bool pcid_enabled = !!kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE);

This change isn't related. Please drop it or post it separately.
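
For what it's worth, pcid_enabled is declared bool, and conversion to
C's bool already normalizes any nonzero value to 1, so the !! only
documents intent; a minimal illustration:

	unsigned long cr4 = X86_CR4_PCIDE;		/* bit 17 set */
	bool pcid_enabled = cr4 & X86_CR4_PCIDE;	/* true, with or without !! */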

> 
> 	if (pcid_enabled) {
> 		skip_tlb_flush = cr3 & X86_CR3_PCID_NOFLUSH;
>@@ -1257,6 +1257,10 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
> 	if (cr3 == kvm_read_cr3(vcpu) && !is_pae_paging(vcpu))
> 		goto handle_tlb_flush;
> 
>+	if (!guest_cpuid_has(vcpu, X86_FEATURE_LAM) &&
>+	    (cr3 & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57)))
>+		return	1;

can you move this check to kvm_vcpu_is_valid_cr3(), i.e., return false in
that function if any LAM bit is toggled while LAM isn't exposed to the guest?
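
A minimal sketch of that relocation, assuming the current body of
kvm_is_valid_cr3() simply checks kvm_vcpu_is_legal_gpa() (the LAM bits
are stripped before the GPA check so a LAM-tagged CR3 from a LAM-aware
guest still passes):

	static bool kvm_is_valid_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
	{
		/* LAM bits are reserved unless LAM is exposed to the guest. */
		if (!guest_cpuid_has(vcpu, X86_FEATURE_LAM) &&
		    (cr3 & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57)))
			return false;

		return kvm_vcpu_is_legal_gpa(vcpu,
					     cr3 & ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57));
	}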

>+
> 	/*
> 	 * Do not condition the GPA check on long mode, this helper is used to
> 	 * stuff CR3, e.g. for RSM emulation, and there is no guarantee that
>@@ -1268,8 +1272,20 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
> 	if (is_pae_paging(vcpu) && !load_pdptrs(vcpu, cr3))
> 		return 1;
> 
>-	if (cr3 != kvm_read_cr3(vcpu))
>-		kvm_mmu_new_pgd(vcpu, cr3);
>+	old_cr3 = kvm_read_cr3(vcpu);
>+	if (cr3 != old_cr3) {
>+		if ((cr3 ^ old_cr3) & CR3_ADDR_MASK) {

Does this check against CR3_ADDR_MASK necessarily mean LAM bits are
toggled, i.e., CR3_ADDR_MASK == ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57)?

Why not check if LAM bits are changed? This way the patch only changes
cases related to LAM, keeping other cases intact.
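
Concretely, keying the dispatch off the LAM bits themselves could look
roughly like this (sketch):

	old_cr3 = kvm_read_cr3(vcpu);
	if (cr3 != old_cr3) {
		if ((cr3 ^ old_cr3) & ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57))
			/* Some non-LAM bit changed: normal pgd switch. */
			kvm_mmu_new_pgd(vcpu, cr3 & ~(X86_CR3_LAM_U48 |
					X86_CR3_LAM_U57));
		else
			/* Only LAM bits changed: reload CR3 on next entry. */
			kvm_make_request(KVM_REQ_LOAD_MMU_PGD, vcpu);
	}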

>+			kvm_mmu_new_pgd(vcpu, cr3 & ~(X86_CR3_LAM_U48 |
>+					X86_CR3_LAM_U57));

Do you need to touch kvm_mmu_new_pgd() in nested_vmx_load_cr3()?
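
For context, nested_vmx_load_cr3() in arch/x86/kvm/vmx/nested.c makes
its own call on the non-EPT path, roughly:

	if (!nested_ept)
		kvm_mmu_new_pgd(vcpu, cr3);

so without similar masking there, the LAM bits would reach the nested
pgd load unfiltered.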

>+		} else {
>+			/*
>+			 * Though the effective address is unchanged,
>+			 * mark the request so that the LAM bits take
>+			 * effect when entering the guest.
>+			 */
>+			kvm_make_request(KVM_REQ_LOAD_MMU_PGD, vcpu);
>+		}
>+	}
> 
> 	vcpu->arch.cr3 = cr3;
> 	kvm_register_mark_dirty(vcpu, VCPU_EXREG_CR3);
>-- 
>2.31.1
>
Robert Hoo Feb. 14, 2023, 5:28 a.m. UTC | #2
On Mon, 2023-02-13 at 11:31 +0800, Chao Gao wrote:
> On Thu, Feb 09, 2023 at 10:40:20AM +0800, Robert Hoo wrote:
> > When the guest changes only the LAM bits, ask the next vcpu run to
> > load the mmu pgd, so that it builds a new CR3 with the updated LAM
> > bits.
> > When CR3's effective address bits change, no matter whether the LAM
> > bits change or not, go through the normal pgd update process.
> 
> Please squash this into patch 2.
> 
Though they all touch CR3, I would prefer to keep them split into
pieces, so that they are easier to review and accept. I can change
their order to group them together. Is that all right with you?
> > 
> > -	bool pcid_enabled = kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE);
> > +	bool pcid_enabled = !!kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE);
> 
> This change isn't related. Please drop it or post it separately.
> 
Yu also commented on this on the last version. I thought it was too
trivial to be separated.
Now that both of you suggest this, let me split it into a separate
patch in this set. Is this all right?
I do think separating it out into a patch set of its own would be too
trivial.
> > 
> > 	if (pcid_enabled) {
> > 		skip_tlb_flush = cr3 & X86_CR3_PCID_NOFLUSH;
> > @@ -1257,6 +1257,10 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu,
> > unsigned long cr3)
> > 	if (cr3 == kvm_read_cr3(vcpu) && !is_pae_paging(vcpu))
> > 		goto handle_tlb_flush;
> > 
> > +	if (!guest_cpuid_has(vcpu, X86_FEATURE_LAM) &&
> > +	    (cr3 & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57)))
> > +		return	1;
> 
> can you move this check to kvm_vcpu_is_valid_cr3(), i.e., return
> false in
> that function if any LAM bit is toggled while LAM isn't exposed to
> the guest?
> 
OK
> > +
> > 	/*
> > 	 * Do not condition the GPA check on long mode, this helper is
> > used to
> > 	 * stuff CR3, e.g. for RSM emulation, and there is no guarantee
> > that
> > @@ -1268,8 +1272,20 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu,
> > unsigned long cr3)
> > 	if (is_pae_paging(vcpu) && !load_pdptrs(vcpu, cr3))
> > 		return 1;
> > 
> > -	if (cr3 != kvm_read_cr3(vcpu))
> > -		kvm_mmu_new_pgd(vcpu, cr3);
> > +	old_cr3 = kvm_read_cr3(vcpu);
> > +	if (cr3 != old_cr3) {
> > +		if ((cr3 ^ old_cr3) & CR3_ADDR_MASK) {
> 
This means the effective address bits changed; then, no matter whether
the LAM bits toggled or not, it needs a new pgd.

> Does this check against CR3_ADDR_MASK necessarily mean LAM bits are
> toggled, i.e., CR3_ADDR_MASK == ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57)?
> 
> Why not check if LAM bits are changed? This way the patch only
> changes
> cases related to LAM, keeping other cases intact.

Yes, I'd better add a check in the "else" branch that the LAM bits
changed.
But in fact kvm_is_valid_cr3() above has already guaranteed that no
other high-order bits changed.
Emm, now you might ask me to melt the LAM bits into
vcpu->arch.reserved_gpa_bits? ;)
> 
> > +			kvm_mmu_new_pgd(vcpu, cr3 & ~(X86_CR3_LAM_U48 |
> > +					X86_CR3_LAM_U57));
> 
> Do you need to touch kvm_mmu_new_pgd() in nested_vmx_load_cr3()?

I didn't scope the nested LAM case into this patch set.
Chao Gao Feb. 14, 2023, 6:48 a.m. UTC | #3
On Tue, Feb 14, 2023 at 01:28:33PM +0800, Robert Hoo wrote:
>> > +
>> > 	/*
>> > 	 * Do not condition the GPA check on long mode, this helper is
>> > used to
>> > 	 * stuff CR3, e.g. for RSM emulation, and there is no guarantee
>> > that
>> > @@ -1268,8 +1272,20 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu,
>> > unsigned long cr3)
>> > 	if (is_pae_paging(vcpu) && !load_pdptrs(vcpu, cr3))
>> > 		return 1;
>> > 
>> > -	if (cr3 != kvm_read_cr3(vcpu))
>> > -		kvm_mmu_new_pgd(vcpu, cr3);
>> > +	old_cr3 = kvm_read_cr3(vcpu);
>> > +	if (cr3 != old_cr3) {
>> > +		if ((cr3 ^ old_cr3) & CR3_ADDR_MASK) {
>> 
>This means the effective address bits changed; then, no matter whether
>the LAM bits toggled or not, it needs a new pgd.
>
>> Does this check against CR3_ADDR_MASK necessarily mean LAM bits are
>> toggled, i.e., CR3_ADDR_MASK == ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57)?
>> 
>> Why not check if LAM bits are changed? This way the patch only
>> changes
>> cases related to LAM, keeping other cases intact.
>
>Yes, I'd better add a check in the "else" branch that the LAM bits
>changed.
>But in fact kvm_is_valid_cr3() above has already guaranteed that no
>other high-order bits changed.
>Emm, now you might ask me to melt the LAM bits into
>vcpu->arch.reserved_gpa_bits? ;)

No, I am not asking for that.

My point is: suppose, for example, that some bit X isn't in
CR3_ADDR_MASK. Then toggling bit X will go into the else{} branch,
which is meant specifically for the LAM bits. So, the change is correct
iff

	CR3_ADDR_MASK == ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57).

I didn't check whether that is true on your code base. If it isn't,
replace CR3_ADDR_MASK with ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57).
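
For a concrete instance: if CR3_ADDR_MASK covers only the physical
address bits (roughly bits 51:12), then a CR3 write that changes just a
PCID bit, e.g.

	new_cr3 = old_cr3 ^ 0x1;	/* flip PCID bit 0, address unchanged */

satisfies ((new_cr3 ^ old_cr3) & CR3_ADDR_MASK) == 0 and would fall
into the else{} branch meant for LAM, skipping kvm_mmu_new_pgd().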

>> 
>> > +			kvm_mmu_new_pgd(vcpu, cr3 & ~(X86_CR3_LAM_U48 |
>> > +					X86_CR3_LAM_U57));
>> 
>> Do you need to touch kvm_mmu_new_pgd() in nested_vmx_load_cr3()?
>
>I didn't scope the nested LAM case into this patch set.

Is there any justification for not considering nested virtualization?
Won't nested virtualization be broken by this series?

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3218f465ae71..7df6c9dd12a5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1242,9 +1242,9 @@  static bool kvm_is_valid_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 {
 	bool skip_tlb_flush = false;
-	unsigned long pcid = 0;
+	unsigned long pcid = 0, old_cr3;
 #ifdef CONFIG_X86_64
-	bool pcid_enabled = kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE);
+	bool pcid_enabled = !!kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE);
 
 	if (pcid_enabled) {
 		skip_tlb_flush = cr3 & X86_CR3_PCID_NOFLUSH;
@@ -1257,6 +1257,10 @@  int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 	if (cr3 == kvm_read_cr3(vcpu) && !is_pae_paging(vcpu))
 		goto handle_tlb_flush;
 
+	if (!guest_cpuid_has(vcpu, X86_FEATURE_LAM) &&
+	    (cr3 & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57)))
+		return	1;
+
 	/*
 	 * Do not condition the GPA check on long mode, this helper is used to
 	 * stuff CR3, e.g. for RSM emulation, and there is no guarantee that
@@ -1268,8 +1272,20 @@  int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 	if (is_pae_paging(vcpu) && !load_pdptrs(vcpu, cr3))
 		return 1;
 
-	if (cr3 != kvm_read_cr3(vcpu))
-		kvm_mmu_new_pgd(vcpu, cr3);
+	old_cr3 = kvm_read_cr3(vcpu);
+	if (cr3 != old_cr3) {
+		if ((cr3 ^ old_cr3) & CR3_ADDR_MASK) {
+			kvm_mmu_new_pgd(vcpu, cr3 & ~(X86_CR3_LAM_U48 |
+					X86_CR3_LAM_U57));
+		} else {
+			/*
+			 * Though the effective address is unchanged,
+			 * mark the request so that the LAM bits take
+			 * effect when entering the guest.
+			 */
+			kvm_make_request(KVM_REQ_LOAD_MMU_PGD, vcpu);
+		}
+	}
 
 	vcpu->arch.cr3 = cr3;
 	kvm_register_mark_dirty(vcpu, VCPU_EXREG_CR3);