[v4,2/5] KVM: x86: Add IBPB support

Message ID 1517404231-22406-3-git-send-email-karahmed@amazon.de (mailing list archive)
State New, archived

Commit Message

KarimAllah Ahmed Jan. 31, 2018, 1:10 p.m. UTC
From: Ashok Raj <ashok.raj@intel.com>

Add MSR passthrough for MSR_IA32_PRED_CMD and place branch predictor
barriers when switching between VMs to avoid inter-VM Spectre-v2 attacks.

[peterz: rebase and changelog rewrite]
[karahmed: - rebase
           - vmx: expose PRED_CMD if guest has it in CPUID
           - svm: only pass through IBPB if guest has it in CPUID
           - vmx: support !cpu_has_vmx_msr_bitmap()
           - vmx: support nested]
[dwmw2: Expose CPUID bit too (AMD IBPB only for now as we lack IBRS)
        PRED_CMD is a write-only MSR]

Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1515720739-43819-6-git-send-email-ashok.raj@intel.com
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
---
 arch/x86/kvm/cpuid.c | 11 ++++++++++-
 arch/x86/kvm/svm.c   | 27 +++++++++++++++++++++++++++
 arch/x86/kvm/vmx.c   | 31 ++++++++++++++++++++++++++++++-
 3 files changed, 67 insertions(+), 2 deletions(-)

Comments

Jim Mattson Jan. 31, 2018, 4:50 p.m. UTC | #1
On Wed, Jan 31, 2018 at 5:10 AM, KarimAllah Ahmed <karahmed@amazon.de> wrote:

> +               vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap, MSR_IA32_PRED_CMD,
> +                                             MSR_TYPE_W);

Why not disable this intercept eagerly, rather than lazily? Unlike
MSR_IA32_SPEC_CTRL, there is no guest value to save/restore, so there
is no cost to disabling the intercept if the guest cpuid info declares
support for it.
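
For illustration, a minimal sketch of the eager variant being suggested,
assuming it were hooked into vmx_cpuid_update() (this is not part of the
posted patch):

        static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
        {
                struct vcpu_vmx *vmx = to_vmx(vcpu);

                /* ... existing CPUID-dependent setup ... */

                /*
                 * PRED_CMD is write-only and has no guest state to
                 * save/restore, so the write intercept can be dropped
                 * as soon as the guest CPUID advertises IBPB, rather
                 * than on the first WRMSR.
                 */
                if (guest_cpuid_has(vcpu, X86_FEATURE_IBPB))
                        vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap,
                                                      MSR_IA32_PRED_CMD,
                                                      MSR_TYPE_W);
        }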


> +       if (to_vmx(vcpu)->save_spec_ctrl_on_exit) {
> +               nested_vmx_disable_intercept_for_msr(
> +                               msr_bitmap_l1, msr_bitmap_l0,
> +                               MSR_IA32_PRED_CMD,
> +                               MSR_TYPE_R);
> +       }

I don't think this should be predicated on
"to_vmx(vcpu)->save_spec_ctrl_on_exit." Why not just
"guest_cpuid_has(vcpu, X86_FEATURE_IBPB)"? Also, the final argument to
nested_vmx_disable_intercept_for_msr should be MSR_TYPE_W rather than
MSR_TYPE_R.
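
With both of those changes folded in, the hunk would read roughly like
this (an illustrative sketch, not the code as posted):

        if (guest_cpuid_has(vcpu, X86_FEATURE_IBPB)) {
                nested_vmx_disable_intercept_for_msr(
                                msr_bitmap_l1, msr_bitmap_l0,
                                MSR_IA32_PRED_CMD,
                                MSR_TYPE_W);
        }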
Paolo Bonzini Jan. 31, 2018, 4:55 p.m. UTC | #2
On 31/01/2018 11:50, Jim Mattson wrote:
>> +       if (to_vmx(vcpu)->save_spec_ctrl_on_exit) {
>> +               nested_vmx_disable_intercept_for_msr(
>> +                               msr_bitmap_l1, msr_bitmap_l0,
>> +                               MSR_IA32_PRED_CMD,
>> +                               MSR_TYPE_R);
>> +       }
> I don't think this should be predicated on
> "to_vmx(vcpu)->save_spec_ctrl_on_exit." Why not just
> "guest_cpuid_has(vcpu, X86_FEATURE_IBPB)"? Also, the final argument to
> nested_vmx_disable_intercept_for_msr should be MSR_TYPE_W rather than
> MSR_TYPE_R.

In fact this MSR can even be passed down unconditionally, since it needs
no save/restore and has no ill performance effect on the sibling
hyperthread.

Only MSR_IA32_SPEC_CTRL needs to be conditional on
"to_vmx(vcpu)->save_spec_ctrl_on_exit".

Paolo
KarimAllah Ahmed Jan. 31, 2018, 5:11 p.m. UTC | #3
On 01/31/2018 05:50 PM, Jim Mattson wrote:
> On Wed, Jan 31, 2018 at 5:10 AM, KarimAllah Ahmed <karahmed@amazon.de> wrote:
> 
>> +               vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap, MSR_IA32_PRED_CMD,
>> +                                             MSR_TYPE_W);
> 
> Why not disable this intercept eagerly, rather than lazily? Unlike
> MSR_IA32_SPEC_CTRL, there is no guest value to save/restore, so there
> is no cost to disabling the intercept if the guest cpuid info declares
> support for it.
> 
> 
>> +       if (to_vmx(vcpu)->save_spec_ctrl_on_exit) {
>> +               nested_vmx_disable_intercept_for_msr(
>> +                               msr_bitmap_l1, msr_bitmap_l0,
>> +                               MSR_IA32_PRED_CMD,
>> +                               MSR_TYPE_R);
>> +       }
> 
> I don't think this should be predicated on
> "to_vmx(vcpu)->save_spec_ctrl_on_exit." Why not just
> "guest_cpuid_has(vcpu, X86_FEATURE_IBPB)"?

Paolo suggested this on the previous revision because guest_cpuid_has()
would be slow.

> Also, the final argument to
> nested_vmx_disable_intercept_for_msr should be MSR_TYPE_W rather than
> MSR_TYPE_R.
> 
Oops! will fix!
Paolo Bonzini Jan. 31, 2018, 5:15 p.m. UTC | #4
On 31/01/2018 12:11, KarimAllah Ahmed wrote:
> On 01/31/2018 05:50 PM, Jim Mattson wrote:
>> On Wed, Jan 31, 2018 at 5:10 AM, KarimAllah Ahmed <karahmed@amazon.de>
>> wrote:
>>
>>> +               vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap,
>>> MSR_IA32_PRED_CMD,
>>> +                                             MSR_TYPE_W);
>>
>> Why not disable this intercept eagerly, rather than lazily? Unlike
>> MSR_IA32_SPEC_CTRL, there is no guest value to save/restore, so there
>> is no cost to disabling the intercept if the guest cpuid info declares
>> support for it.
>>
>>
>>> +       if (to_vmx(vcpu)->save_spec_ctrl_on_exit) {
>>> +               nested_vmx_disable_intercept_for_msr(
>>> +                               msr_bitmap_l1, msr_bitmap_l0,
>>> +                               MSR_IA32_PRED_CMD,
>>> +                               MSR_TYPE_R);
>>> +       }
>>
>> I don't think this should be predicated on
>> "to_vmx(vcpu)->save_spec_ctrl_on_exit." Why not just
>> "guest_cpuid_has(vcpu, X86_FEATURE_IBPB)"?
> 
> Paolo suggested this on the previous revision because guest_cpuid_has()
> would be slow.

Sorry, that was for spec_ctrl.  Here there's no need to do any kind of
conditional check.

Paolo

>> Also, the final argument to
>> nested_vmx_disable_intercept_for_msr should be MSR_TYPE_W rather than
>> MSR_TYPE_R.
>>
> Oops! will fix!
KarimAllah Ahmed Jan. 31, 2018, 5:17 p.m. UTC | #5
On 01/31/2018 05:55 PM, Paolo Bonzini wrote:
> On 31/01/2018 11:50, Jim Mattson wrote:
>>> +       if (to_vmx(vcpu)->save_spec_ctrl_on_exit) {
>>> +               nested_vmx_disable_intercept_for_msr(
>>> +                               msr_bitmap_l1, msr_bitmap_l0,
>>> +                               MSR_IA32_PRED_CMD,
>>> +                               MSR_TYPE_R);
>>> +       }
>> I don't think this should be predicated on
>> "to_vmx(vcpu)->save_spec_ctrl_on_exit." Why not just
>> "guest_cpuid_has(vcpu, X86_FEATURE_IBPB)"? Also, the final argument to
>> nested_vmx_disable_intercept_for_msr should be MSR_TYPE_W rather than
>> MSR_TYPE_R.
> 
> In fact this MSR can even be passed down unconditionally, since it needs
> no save/restore and has no ill performance effect on the sibling
> hyperthread.
> 
> Only MSR_IA32_SPEC_CTRL needs to be conditional on
> "to_vmx(vcpu)->save_spec_ctrl_on_exit".

That used to be the case in an earlier version. There seem to be two
opinions here:

1) Pass it only if CPUID for the guest has it.
2) Pass it unconditionally.

I do not really have a preference.


> 
> Paolo
> 
Paolo Bonzini Jan. 31, 2018, 5:32 p.m. UTC | #6
Looking at SVM now...

On 31/01/2018 08:10, KarimAllah Ahmed wrote:
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index f40d0da..89495cf 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -529,6 +529,7 @@ struct svm_cpu_data {
>  	struct kvm_ldttss_desc *tss_desc;
>  
>  	struct page *save_area;
> +	struct vmcb *current_vmcb;
>  };
>  
>  static DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
> @@ -1703,11 +1704,17 @@ static void svm_free_vcpu(struct kvm_vcpu *vcpu)
>  	__free_pages(virt_to_page(svm->nested.msrpm), MSRPM_ALLOC_ORDER);
>  	kvm_vcpu_uninit(vcpu);
>  	kmem_cache_free(kvm_vcpu_cache, svm);
> +	/*
> +	 * The vmcb page can be recycled, causing a false negative in
> +	 * svm_vcpu_load(). So do a full IBPB now.
> +	 */
> +	indirect_branch_prediction_barrier();
>  }
>  
>  static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  {
>  	struct vcpu_svm *svm = to_svm(vcpu);
> +	struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
>  	int i;
>  
>  	if (unlikely(cpu != vcpu->cpu)) {
> @@ -1736,6 +1743,10 @@ static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  	if (static_cpu_has(X86_FEATURE_RDTSCP))
>  		wrmsrl(MSR_TSC_AUX, svm->tsc_aux);
>  
> +	if (sd->current_vmcb != svm->vmcb) {
> +		sd->current_vmcb = svm->vmcb;
> +		indirect_branch_prediction_barrier();
> +	}
>  	avic_vcpu_load(vcpu, cpu);
>  }
>  
> @@ -3684,6 +3695,22 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>  	case MSR_IA32_TSC:
>  		kvm_write_tsc(vcpu, msr);
>  		break;
> +	case MSR_IA32_PRED_CMD:
> +		if (!msr->host_initiated &&
> +		    !guest_cpuid_has(vcpu, X86_FEATURE_IBPB))
> +			return 1;
> +
> +		if (data & ~PRED_CMD_IBPB)
> +			return 1;
> +
> +		if (!data)
> +			break;
> +
> +		wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);

This is missing an

        { .index = MSR_IA32_PRED_CMD,                   .always = false },

in the direct_access_msrs array.  Once you do this, nested is taken care of
automatically.

Likewise for MSR_IA32_SPEC_CTRL in patch 5.
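
For reference, the entry would slot into the existing table alongside the
other conditionally passed-through MSRs, roughly like this (surrounding
entries elided):

        static const struct svm_direct_access_msrs {
                u32 index;      /* Index of the MSR */
                bool always;    /* True if intercept is always on */
        } direct_access_msrs[] = {
                /* ... */
                { .index = MSR_IA32_LASTINTTOIP,        .always = false },
                { .index = MSR_IA32_PRED_CMD,           .always = false },
                { .index = MSR_INVALID,                 .always = false },
        };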

Paolo

> +		if (is_guest_mode(vcpu))
> +			break;
> +		set_msr_interception(svm->msrpm, MSR_IA32_PRED_CMD, 0, 1);
> +		break;
>  	case MSR_STAR:
>  		svm->vmcb->save.star = data;
>  		break;
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index d46a61b..96e672e 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -2285,6 +2285,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  	if (per_cpu(current_vmcs, cpu) != vmx->loaded_vmcs->vmcs) {
>  		per_cpu(current_vmcs, cpu) = vmx->loaded_vmcs->vmcs;
>  		vmcs_load(vmx->loaded_vmcs->vmcs);
> +		indirect_branch_prediction_barrier();
>  	}
>  
>  	if (!already_loaded) {
> @@ -3342,6 +3343,25 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	case MSR_IA32_TSC:
>  		kvm_write_tsc(vcpu, msr_info);
>  		break;
> +	case MSR_IA32_PRED_CMD:
> +		if (!msr_info->host_initiated &&
> +		    !guest_cpuid_has(vcpu, X86_FEATURE_IBPB))
> +			return 1;
> +
> +		if (data & ~PRED_CMD_IBPB)
> +			return 1;
> +
> +		if (!data)
> +			break;
> +
> +		wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);
> +
> +		if (is_guest_mode(vcpu))
> +			break;
> +
> +		vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap, MSR_IA32_PRED_CMD,
> +					      MSR_TYPE_W);
> +		break;
>  	case MSR_IA32_CR_PAT:
>  		if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
>  			if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
> @@ -10046,7 +10066,8 @@ static inline bool nested_vmx_merge_msr_bitmap(struct kvm_vcpu *vcpu,
>  	unsigned long *msr_bitmap_l0 = to_vmx(vcpu)->nested.vmcs02.msr_bitmap;
>  
>  	/* This shortcut is ok because we support only x2APIC MSRs so far. */
> -	if (!nested_cpu_has_virt_x2apic_mode(vmcs12))
> +	if (!nested_cpu_has_virt_x2apic_mode(vmcs12) &&
> +	    !to_vmx(vcpu)->save_spec_ctrl_on_exit)
>  		return false;
>  
>  	page = kvm_vcpu_gpa_to_page(vcpu, vmcs12->msr_bitmap);
> @@ -10079,6 +10100,14 @@ static inline bool nested_vmx_merge_msr_bitmap(struct kvm_vcpu *vcpu,
>  				MSR_TYPE_W);
>  		}
>  	}
> +
> +	if (to_vmx(vcpu)->save_spec_ctrl_on_exit) {
> +		nested_vmx_disable_intercept_for_msr(
> +				msr_bitmap_l1, msr_bitmap_l0,
> +				MSR_IA32_PRED_CMD,
> +				MSR_TYPE_R);
> +	}
> +
>  	kunmap(page);
>  	kvm_release_page_clean(page);
>  
>
Jim Mattson Jan. 31, 2018, 5:39 p.m. UTC | #7
On Wed, Jan 31, 2018 at 8:55 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:

> In fact this MSR can even be passed down unconditionally, since it needs
> no save/restore and has no ill performance effect on the sibling
> hyperthread.

I'm a bit surprised to hear that IBPB has no ill performance impact on
the sibling hyperthread. On current CPUs, this has to flush the BTB,
doesn't it? And since the BTB is shared between hyperthreads, doesn't
the sibling lose all of its branch predictions?
Van De Ven, Arjan Jan. 31, 2018, 6:04 p.m. UTC | #8
> On Wed, Jan 31, 2018 at 8:55 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> > In fact this MSR can even be passed down unconditionally, since it needs
> > no save/restore and has no ill performance effect on the sibling
> > hyperthread.
> 
> I'm a bit surprised to hear that IBPB has no ill performance impact on
> the sibling hyperthread. On current CPUs, this has to flush the BTB,
> doesn't it? And since the BTB is shared between hyperthreads, doesn't
> the sibling lose all of its branch predictions?

IBPB most definitely (in current implementations) will stop both
hyperthreads for the flush.

IBPB is not a cheap operation on a system level.
Paolo Bonzini Jan. 31, 2018, 6:53 p.m. UTC | #9
On 31/01/2018 12:39, Jim Mattson wrote:
> On Wed, Jan 31, 2018 at 8:55 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
>> In fact this MSR can even be passed down unconditionally, since it needs
>> no save/restore and has no ill performance effect on the sibling
>> hyperthread.
> 
> I'm a bit surprised to hear that IBPB has no ill performance impact on
> the sibling hyperthread. On current CPUs, this has to flush the BTB,
> doesn't it? And since the BTB is shared between hyperthreads, doesn't
> the sibling lose all of its branch predictions?

Yeah, I knew about that, but I hadn't heard yet that it also blocks the
hyperthread while you do the write.  It makes sense in retrospect.

In any case, there's no difference (unlike IBRS) in the vmexit cost if
the MSR is passed through.

Paolo

Patch

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index c0eb337..033004d 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -365,6 +365,10 @@  static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 		F(3DNOWPREFETCH) | F(OSVW) | 0 /* IBS */ | F(XOP) |
 		0 /* SKINIT, WDT, LWP */ | F(FMA4) | F(TBM);
 
+	/* cpuid 0x80000008.ebx */
+	const u32 kvm_cpuid_8000_0008_ebx_x86_features =
+		F(IBPB);
+
 	/* cpuid 0xC0000001.edx */
 	const u32 kvm_cpuid_C000_0001_edx_x86_features =
 		F(XSTORE) | F(XSTORE_EN) | F(XCRYPT) | F(XCRYPT_EN) |
@@ -625,7 +629,12 @@  static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 		if (!g_phys_as)
 			g_phys_as = phys_as;
 		entry->eax = g_phys_as | (virt_as << 8);
-		entry->ebx = entry->edx = 0;
+		entry->edx = 0;
+		/* IBPB isn't necessarily present in hardware cpuid */
+		if (boot_cpu_has(X86_FEATURE_IBPB))
+			entry->ebx |= F(IBPB);
+		entry->ebx &= kvm_cpuid_8000_0008_ebx_x86_features;
+		cpuid_mask(&entry->ebx, CPUID_8000_0008_EBX);
 		break;
 	}
 	case 0x80000019:
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index f40d0da..89495cf 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -529,6 +529,7 @@  struct svm_cpu_data {
 	struct kvm_ldttss_desc *tss_desc;
 
 	struct page *save_area;
+	struct vmcb *current_vmcb;
 };
 
 static DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
@@ -1703,11 +1704,17 @@  static void svm_free_vcpu(struct kvm_vcpu *vcpu)
 	__free_pages(virt_to_page(svm->nested.msrpm), MSRPM_ALLOC_ORDER);
 	kvm_vcpu_uninit(vcpu);
 	kmem_cache_free(kvm_vcpu_cache, svm);
+	/*
+	 * The vmcb page can be recycled, causing a false negative in
+	 * svm_vcpu_load(). So do a full IBPB now.
+	 */
+	indirect_branch_prediction_barrier();
 }
 
 static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
+	struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
 	int i;
 
 	if (unlikely(cpu != vcpu->cpu)) {
@@ -1736,6 +1743,10 @@  static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	if (static_cpu_has(X86_FEATURE_RDTSCP))
 		wrmsrl(MSR_TSC_AUX, svm->tsc_aux);
 
+	if (sd->current_vmcb != svm->vmcb) {
+		sd->current_vmcb = svm->vmcb;
+		indirect_branch_prediction_barrier();
+	}
 	avic_vcpu_load(vcpu, cpu);
 }
 
@@ -3684,6 +3695,22 @@  static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 	case MSR_IA32_TSC:
 		kvm_write_tsc(vcpu, msr);
 		break;
+	case MSR_IA32_PRED_CMD:
+		if (!msr->host_initiated &&
+		    !guest_cpuid_has(vcpu, X86_FEATURE_IBPB))
+			return 1;
+
+		if (data & ~PRED_CMD_IBPB)
+			return 1;
+
+		if (!data)
+			break;
+
+		wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);
+		if (is_guest_mode(vcpu))
+			break;
+		set_msr_interception(svm->msrpm, MSR_IA32_PRED_CMD, 0, 1);
+		break;
 	case MSR_STAR:
 		svm->vmcb->save.star = data;
 		break;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index d46a61b..96e672e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2285,6 +2285,7 @@  static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	if (per_cpu(current_vmcs, cpu) != vmx->loaded_vmcs->vmcs) {
 		per_cpu(current_vmcs, cpu) = vmx->loaded_vmcs->vmcs;
 		vmcs_load(vmx->loaded_vmcs->vmcs);
+		indirect_branch_prediction_barrier();
 	}
 
 	if (!already_loaded) {
@@ -3342,6 +3343,25 @@  static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_TSC:
 		kvm_write_tsc(vcpu, msr_info);
 		break;
+	case MSR_IA32_PRED_CMD:
+		if (!msr_info->host_initiated &&
+		    !guest_cpuid_has(vcpu, X86_FEATURE_IBPB))
+			return 1;
+
+		if (data & ~PRED_CMD_IBPB)
+			return 1;
+
+		if (!data)
+			break;
+
+		wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);
+
+		if (is_guest_mode(vcpu))
+			break;
+
+		vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap, MSR_IA32_PRED_CMD,
+					      MSR_TYPE_W);
+		break;
 	case MSR_IA32_CR_PAT:
 		if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
 			if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
@@ -10046,7 +10066,8 @@  static inline bool nested_vmx_merge_msr_bitmap(struct kvm_vcpu *vcpu,
 	unsigned long *msr_bitmap_l0 = to_vmx(vcpu)->nested.vmcs02.msr_bitmap;
 
 	/* This shortcut is ok because we support only x2APIC MSRs so far. */
-	if (!nested_cpu_has_virt_x2apic_mode(vmcs12))
+	if (!nested_cpu_has_virt_x2apic_mode(vmcs12) &&
+	    !to_vmx(vcpu)->save_spec_ctrl_on_exit)
 		return false;
 
 	page = kvm_vcpu_gpa_to_page(vcpu, vmcs12->msr_bitmap);
@@ -10079,6 +10100,14 @@  static inline bool nested_vmx_merge_msr_bitmap(struct kvm_vcpu *vcpu,
 				MSR_TYPE_W);
 		}
 	}
+
+	if (to_vmx(vcpu)->save_spec_ctrl_on_exit) {
+		nested_vmx_disable_intercept_for_msr(
+				msr_bitmap_l1, msr_bitmap_l0,
+				MSR_IA32_PRED_CMD,
+				MSR_TYPE_R);
+	}
+
 	kunmap(page);
 	kvm_release_page_clean(page);