diff mbox series

[3/4] KVM: nVMX: Set vmcs02->vpid to vmcs12->vpid if L1 uses EPT

Message ID 20180906133228.118282-4-liran.alon@oracle.com (mailing list archive)
State New, archived
Headers show
Series : KVM: nVMX: Consider TLB are tagged with different EPTP if L1 uses EPT | expand

Commit Message

Liran Alon Sept. 6, 2018, 1:32 p.m. UTC
If CPU use both VPID and EPT, TLB entries populated by CPU are tagged
with both EPTP and VPID. Therefore, if L1 uses EPT, L2 TLB entries
are separated from L1 TLB entries by the EPTP tags as vmcs02 use
EPTP02 while vmcs01 use EPTP01.

Thus, we don't need to make sure that vmcs02->vpid != vmcs01->vpid.
Therefore, we can just set vmcs02->vpid to vmcs12->vpid.

Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Nikita Leshchenko <nikita.leshchenko@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
---
 arch/x86/kvm/vmx.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

Comments

Jim Mattson Sept. 6, 2018, 4:06 p.m. UTC | #1
On Thu, Sep 6, 2018 at 6:32 AM, Liran Alon <liran.alon@oracle.com> wrote:
> If CPU use both VPID and EPT, TLB entries populated by CPU are tagged
> with both EPTP and VPID. Therefore, if L1 uses EPT, L2 TLB entries
> are separated from L1 TLB entries by the EPTP tags as vmcs02 use
> EPTP02 while vmcs01 use EPTP01.
>
> Thus, we don't need to make sure that vmcs02->vpid != vmcs01->vpid.
> Therefore, we can just set vmcs02->vpid to vmcs12->vpid.
>
> Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
> Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
> Reviewed-by: Nikita Leshchenko <nikita.leshchenko@oracle.com>
> Signed-off-by: Liran Alon <liran.alon@oracle.com>

I suggested this back in July, but Paolo didn't like it. I still like it. :-)

Reviewed-by: Jim Mattson <jmattson@google.com>

> ---
>  arch/x86/kvm/vmx.c | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index b97e0c5ccb46..98faba65c24a 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -12018,10 +12018,20 @@ static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>                 vmcs_write64(GUEST_BNDCFGS, vmcs12->guest_bndcfgs);
>
>         if (enable_vpid) {
> -               if (nested_cpu_has_vpid(vmcs12) && vmx->nested.vpid02)
> -                       vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->nested.vpid02);
> -               else
> -                       vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->vpid);
> +               u16 vmcs02_vpid;

You could simplify the logic below if you initialize vmcs02_vpid to vmx->vpid.
BTW, when did we switch to compilation options that allow mid-block
declarations? Woohoo!

> +               if (nested_cpu_has_vpid(vmcs12)) {
> +                       if (nested_cpu_has_ept(vmcs12))
> +                               vmcs02_vpid = vmcs12->virtual_processor_id;
> +                       else if (vmx->nested.vpid02)
> +                               vmcs02_vpid = vmx->nested.vpid02;
> +                       else
> +                               vmcs02_vpid = vmx->vpid;
> +               } else {
> +                       vmcs02_vpid = vmx->vpid;
> +               }
> +
> +               vmcs_write16(VIRTUAL_PROCESSOR_ID, vmcs02_vpid);
>         }
>
>         /*
> --
> 2.16.1
>
Liran Alon Sept. 6, 2018, 5:25 p.m. UTC | #2
> On 6 Sep 2018, at 19:06, Jim Mattson <jmattson@google.com> wrote:
> 
> On Thu, Sep 6, 2018 at 6:32 AM, Liran Alon <liran.alon@oracle.com> wrote:
>> If CPU use both VPID and EPT, TLB entries populated by CPU are tagged
>> with both EPTP and VPID. Therefore, if L1 uses EPT, L2 TLB entries
>> are separated from L1 TLB entries by the EPTP tags as vmcs02 use
>> EPTP02 while vmcs01 use EPTP01.
>> 
>> Thus, we don't need to make sure that vmcs02->vpid != vmcs01->vpid.
>> Therefore, we can just set vmcs02->vpid to vmcs12->vpid.
>> 
>> Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
>> Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
>> Reviewed-by: Nikita Leshchenko <nikita.leshchenko@oracle.com>
>> Signed-off-by: Liran Alon <liran.alon@oracle.com>
> 
> I suggested this back in July, but Paolo didn't like it. I still like it. :-)
> 
> Reviewed-by: Jim Mattson <jmattson@google.com>

Yes, you are right this is based on your suggestion.
This is why I Cc you to this email thread. However, I forgot to:
Suggested-by: Jim Mattson <jmattson@google.com>
Paolo/Radim, if you decide to apply this, please add this tag to the commit.

Thanks,
-Liran

> 
>> ---
>> arch/x86/kvm/vmx.c | 18 ++++++++++++++----
>> 1 file changed, 14 insertions(+), 4 deletions(-)
>> 
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index b97e0c5ccb46..98faba65c24a 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -12018,10 +12018,20 @@ static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>>                vmcs_write64(GUEST_BNDCFGS, vmcs12->guest_bndcfgs);
>> 
>>        if (enable_vpid) {
>> -               if (nested_cpu_has_vpid(vmcs12) && vmx->nested.vpid02)
>> -                       vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->nested.vpid02);
>> -               else
>> -                       vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->vpid);
>> +               u16 vmcs02_vpid;
> 
> You could simplify the logic below if you initialize vmcs02_vpid to vmx->vpid.
> BTW, when did we switch to compilation options that allow mid-block
> declarations? Woohoo!
> 
>> +               if (nested_cpu_has_vpid(vmcs12)) {
>> +                       if (nested_cpu_has_ept(vmcs12))
>> +                               vmcs02_vpid = vmcs12->virtual_processor_id;
>> +                       else if (vmx->nested.vpid02)
>> +                               vmcs02_vpid = vmx->nested.vpid02;
>> +                       else
>> +                               vmcs02_vpid = vmx->vpid;
>> +               } else {
>> +                       vmcs02_vpid = vmx->vpid;
>> +               }
>> +
>> +               vmcs_write16(VIRTUAL_PROCESSOR_ID, vmcs02_vpid);
>>        }
>> 
>>        /*
>> --
>> 2.16.1
>>
Paolo Bonzini Oct. 1, 2018, 12:56 p.m. UTC | #3
On 06/09/2018 18:06, Jim Mattson wrote:
> On Thu, Sep 6, 2018 at 6:32 AM, Liran Alon <liran.alon@oracle.com> wrote:
>> If CPU use both VPID and EPT, TLB entries populated by CPU are tagged
>> with both EPTP and VPID. Therefore, if L1 uses EPT, L2 TLB entries
>> are separated from L1 TLB entries by the EPTP tags as vmcs02 use
>> EPTP02 while vmcs01 use EPTP01.
>>
>> Thus, we don't need to make sure that vmcs02->vpid != vmcs01->vpid.
>> Therefore, we can just set vmcs02->vpid to vmcs12->vpid.
>>
>> Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
>> Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
>> Reviewed-by: Nikita Leshchenko <nikita.leshchenko@oracle.com>
>> Signed-off-by: Liran Alon <liran.alon@oracle.com>
>
> I suggested this back in July, but Paolo didn't like it. I still like it. :-)
> 
> Reviewed-by: Jim Mattson <jmattson@google.com>

The problem with this is still the same as in July, namely that if all
guests (at any level) share the VPID space, then L1 can force an
invalidation of any VPID (and thus slow down execution of other guests,
including siblings of L1) through INVVPID.

Paolo
Jim Mattson Oct. 1, 2018, 4:49 p.m. UTC | #4
On Mon, Oct 1, 2018 at 5:56 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 06/09/2018 18:06, Jim Mattson wrote:
>> On Thu, Sep 6, 2018 at 6:32 AM, Liran Alon <liran.alon@oracle.com> wrote:
>>> If CPU use both VPID and EPT, TLB entries populated by CPU are tagged
>>> with both EPTP and VPID. Therefore, if L1 uses EPT, L2 TLB entries
>>> are separated from L1 TLB entries by the EPTP tags as vmcs02 use
>>> EPTP02 while vmcs01 use EPTP01.
>>>
>>> Thus, we don't need to make sure that vmcs02->vpid != vmcs01->vpid.
>>> Therefore, we can just set vmcs02->vpid to vmcs12->vpid.
>>>
>>> Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
>>> Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
>>> Reviewed-by: Nikita Leshchenko <nikita.leshchenko@oracle.com>
>>> Signed-off-by: Liran Alon <liran.alon@oracle.com>
>>
>> I suggested this back in July, but Paolo didn't like it. I still like it. :-)
>>
>> Reviewed-by: Jim Mattson <jmattson@google.com>
>
> The problem with this is still the same as in July, namely that if all
> guests (at any level) share the VPID space, then L1 can force an
> invalidation of any VPID (and thus slow down execution of other guests,
> including siblings of L1) through INVVPID.

Can't L1 already do this by filling the TLB with its own entries,
thereby evicting its siblings' entries?
Liran Alon Oct. 7, 2018, 8:14 p.m. UTC | #5
> On 1 Oct 2018, at 19:49, Jim Mattson <jmattson@google.com> wrote:
> 
> On Mon, Oct 1, 2018 at 5:56 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 06/09/2018 18:06, Jim Mattson wrote:
>>> On Thu, Sep 6, 2018 at 6:32 AM, Liran Alon <liran.alon@oracle.com> wrote:
>>>> If CPU use both VPID and EPT, TLB entries populated by CPU are tagged
>>>> with both EPTP and VPID. Therefore, if L1 uses EPT, L2 TLB entries
>>>> are separated from L1 TLB entries by the EPTP tags as vmcs02 use
>>>> EPTP02 while vmcs01 use EPTP01.
>>>> 
>>>> Thus, we don't need to make sure that vmcs02->vpid != vmcs01->vpid.
>>>> Therefore, we can just set vmcs02->vpid to vmcs12->vpid.
>>>> 
>>>> Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
>>>> Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
>>>> Reviewed-by: Nikita Leshchenko <nikita.leshchenko@oracle.com>
>>>> Signed-off-by: Liran Alon <liran.alon@oracle.com>
>>> 
>>> I suggested this back in July, but Paolo didn't like it. I still like it. :-)
>>> 
>>> Reviewed-by: Jim Mattson <jmattson@google.com>
>> 
>> The problem with this is still the same as in July, namely that if all
>> guests (at any level) share the VPID space, then L1 can force an
>> invalidation of any VPID (and thus slow down execution of other guests,
>> including siblings of L1) through INVVPID.

Funny fact: This discussion have lead me to look at handle_invvpid() which L0 use to
emulate INVVPID for L1 and it seems to me to contain a serious issue.
All calls to __vmx_flush_tlb() is called with invalidate_gpa set to true which actually results
in L0 executing INVEPT(eptp01) instead of INVVPID(vmx->nested.vpid02).
(Bug was introduced by c2ba05ccfde2 ("KVM: X86: introduce invalidate_gpa argument to tlb flush”)).
I will send a very simple patch to fix this issue :)

In addition, it seems that the patch I have submitted here also lacks changing handle_invvpid()
to execute INVVPID(vmcs12->vpid) instead of INVVPID(vmx->nested.vpid02) in case vmcs12 use EPT.
I can fix this in v2 of this patch if we will decide it should be applied… (See below)

I understand the concern raised here as even a non-malicious L1 guest will likely hurt other guests
performance when executing a non-individual-address type INVVPID.
What we really wish to have here is a new Intel VMX instruction which allows invalidating TLB entries
by given EPTP *and* VPID. Too bad there isn’t such an instruction…

> 
> Can't L1 already do this by filling the TLB with its own entries,
> thereby evicting its siblings' entries?

Yes, but I agree that with this change, it is much easier for a L1 to just invalidate combined mappings used by other guests.
So I tend to agree with Paolo that maybe this patch should be dropped…

-Liran
Liran Alon Oct. 7, 2018, 8:50 p.m. UTC | #6
> On 7 Oct 2018, at 23:14, Liran Alon <liran.alon@oracle.com> wrote:
> 
> 
> 
>> On 1 Oct 2018, at 19:49, Jim Mattson <jmattson@google.com> wrote:
>> 
>> On Mon, Oct 1, 2018 at 5:56 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> On 06/09/2018 18:06, Jim Mattson wrote:
>>>> On Thu, Sep 6, 2018 at 6:32 AM, Liran Alon <liran.alon@oracle.com> wrote:
>>>>> If CPU use both VPID and EPT, TLB entries populated by CPU are tagged
>>>>> with both EPTP and VPID. Therefore, if L1 uses EPT, L2 TLB entries
>>>>> are separated from L1 TLB entries by the EPTP tags as vmcs02 use
>>>>> EPTP02 while vmcs01 use EPTP01.
>>>>> 
>>>>> Thus, we don't need to make sure that vmcs02->vpid != vmcs01->vpid.
>>>>> Therefore, we can just set vmcs02->vpid to vmcs12->vpid.
>>>>> 
>>>>> Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
>>>>> Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
>>>>> Reviewed-by: Nikita Leshchenko <nikita.leshchenko@oracle.com>
>>>>> Signed-off-by: Liran Alon <liran.alon@oracle.com>
>>>> 
>>>> I suggested this back in July, but Paolo didn't like it. I still like it. :-)
>>>> 
>>>> Reviewed-by: Jim Mattson <jmattson@google.com>
>>> 
>>> The problem with this is still the same as in July, namely that if all
>>> guests (at any level) share the VPID space, then L1 can force an
>>> invalidation of any VPID (and thus slow down execution of other guests,
>>> including siblings of L1) through INVVPID.
> 
> Funny fact: This discussion have lead me to look at handle_invvpid() which L0 use to
> emulate INVVPID for L1 and it seems to me to contain a serious issue.
> All calls to __vmx_flush_tlb() is called with invalidate_gpa set to true which actually results
> in L0 executing INVEPT(eptp01) instead of INVVPID(vmx->nested.vpid02).
> (Bug was introduced by c2ba05ccfde2 ("KVM: X86: introduce invalidate_gpa argument to tlb flush”)).
> I will send a very simple patch to fix this issue :)
> 
> In addition, it seems that the patch I have submitted here also lacks changing handle_invvpid()
> to execute INVVPID(vmcs12->vpid) instead of INVVPID(vmx->nested.vpid02) in case vmcs12 use EPT.
> I can fix this in v2 of this patch if we will decide it should be applied… (See below)
> 
> I understand the concern raised here as even a non-malicious L1 guest will likely hurt other guests
> performance when executing a non-individual-address type INVVPID.
> What we really wish to have here is a new Intel VMX instruction which allows invalidating TLB entries
> by given EPTP *and* VPID. Too bad there isn’t such an instruction…
> 
>> 
>> Can't L1 already do this by filling the TLB with its own entries,
>> thereby evicting its siblings' entries?
> 
> Yes, but I agree that with this change, it is much easier for a L1 to just invalidate combined mappings used by other guests.
> So I tend to agree with Paolo that maybe this patch should be dropped…
> 
> -Liran

I have an idea on how to preserve the wanted functionality of this patch:
We could modify handle_invvpid() to emulate INVVPID of all types besides individual-address by executing INVEPT(eptp02) when vmcs12 uses EPT.
This will prevent L1 from invalidating TLB entries of other guests by executing INVVPID.

This will also result in every INVVPID of non-individual-address type to invalidate all L2 TLB entries of given eptp02 of all various L2 VPIDs.
However, it is not different than today in that regard, as today all L2 TLB entries are tagged with a single vmx->nested.vpid02.

Paolo, if you agree to this, I will create relevant patches for v2 of this series.

-Liran
Liran Alon Oct. 7, 2018, 10:15 p.m. UTC | #7
> On 7 Oct 2018, at 23:50, Liran Alon <liran.alon@oracle.com> wrote:
> 
> 
> 
>> On 7 Oct 2018, at 23:14, Liran Alon <liran.alon@oracle.com> wrote:
>> 
>> 
>> 
>>> On 1 Oct 2018, at 19:49, Jim Mattson <jmattson@google.com> wrote:
>>> 
>>> On Mon, Oct 1, 2018 at 5:56 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>> On 06/09/2018 18:06, Jim Mattson wrote:
>>>>> On Thu, Sep 6, 2018 at 6:32 AM, Liran Alon <liran.alon@oracle.com> wrote:
>>>>>> If CPU use both VPID and EPT, TLB entries populated by CPU are tagged
>>>>>> with both EPTP and VPID. Therefore, if L1 uses EPT, L2 TLB entries
>>>>>> are separated from L1 TLB entries by the EPTP tags as vmcs02 use
>>>>>> EPTP02 while vmcs01 use EPTP01.
>>>>>> 
>>>>>> Thus, we don't need to make sure that vmcs02->vpid != vmcs01->vpid.
>>>>>> Therefore, we can just set vmcs02->vpid to vmcs12->vpid.
>>>>>> 
>>>>>> Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
>>>>>> Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
>>>>>> Reviewed-by: Nikita Leshchenko <nikita.leshchenko@oracle.com>
>>>>>> Signed-off-by: Liran Alon <liran.alon@oracle.com>
>>>>> 
>>>>> I suggested this back in July, but Paolo didn't like it. I still like it. :-)
>>>>> 
>>>>> Reviewed-by: Jim Mattson <jmattson@google.com>
>>>> 
>>>> The problem with this is still the same as in July, namely that if all
>>>> guests (at any level) share the VPID space, then L1 can force an
>>>> invalidation of any VPID (and thus slow down execution of other guests,
>>>> including siblings of L1) through INVVPID.
>> 
>> Funny fact: This discussion have lead me to look at handle_invvpid() which L0 use to
>> emulate INVVPID for L1 and it seems to me to contain a serious issue.
>> All calls to __vmx_flush_tlb() is called with invalidate_gpa set to true which actually results
>> in L0 executing INVEPT(eptp01) instead of INVVPID(vmx->nested.vpid02).
>> (Bug was introduced by c2ba05ccfde2 ("KVM: X86: introduce invalidate_gpa argument to tlb flush”)).
>> I will send a very simple patch to fix this issue :)
>> 
>> In addition, it seems that the patch I have submitted here also lacks changing handle_invvpid()
>> to execute INVVPID(vmcs12->vpid) instead of INVVPID(vmx->nested.vpid02) in case vmcs12 use EPT.
>> I can fix this in v2 of this patch if we will decide it should be applied… (See below)
>> 
>> I understand the concern raised here as even a non-malicious L1 guest will likely hurt other guests
>> performance when executing a non-individual-address type INVVPID.
>> What we really wish to have here is a new Intel VMX instruction which allows invalidating TLB entries
>> by given EPTP *and* VPID. Too bad there isn’t such an instruction…
>> 
>>> 
>>> Can't L1 already do this by filling the TLB with its own entries,
>>> thereby evicting its siblings' entries?
>> 
>> Yes, but I agree that with this change, it is much easier for a L1 to just invalidate combined mappings used by other guests.
>> So I tend to agree with Paolo that maybe this patch should be dropped…
>> 
>> -Liran
> 
> I have an idea on how to preserve the wanted functionality of this patch:
> We could modify handle_invvpid() to emulate INVVPID of all types besides individual-address by executing INVEPT(eptp02) when vmcs12 uses EPT.
> This will prevent L1 from invalidating TLB entries of other guests by executing INVVPID.
> 
> This will also result in every INVVPID of non-individual-address type to invalidate all L2 TLB entries of given eptp02 of all various L2 VPIDs.
> However, it is not different than today in that regard, as today all L2 TLB entries are tagged with a single vmx->nested.vpid02.
> 
> Paolo, if you agree to this, I will create relevant patches for v2 of this series.
> 
> -Liran
> 
> 

Please ignore last email idea. Not really practical…
I don’t know what is worse in it: The fact there is no vmcs12 to check if it uses EPT
or the fact that there is no easy way to translate between a vmcs12->vpid to all the active eptp02 that it was used on…
Sorry for the spam. :)

I will just submit a fix for the bug of calling __vmx_flush_tlb() with bad parameter.

-Liran
Paolo Bonzini Oct. 8, 2018, 1:11 p.m. UTC | #8
On 08/10/2018 00:15, Liran Alon wrote:
> Please ignore last email idea. Not really practical…

No problem :)

> I don’t know what is worse in it: The fact there is no vmcs12 to check if it uses EPT
> or the fact that there is no easy way to translate between a vmcs12->vpid to all the active eptp02 that it was used on…
> Sorry for the spam. :)
> 
> I will just submit a fix for the bug of calling __vmx_flush_tlb() with bad parameter.

Thanks!

Paolo
Sean Christopherson Oct. 8, 2018, 1:24 p.m. UTC | #9
On Sun, 2018-10-07 at 23:14 +0300, Liran Alon wrote:
> 
> > 
> > On 1 Oct 2018, at 19:49, Jim Mattson <jmattson@google.com> wrote:
> > 
> > On Mon, Oct 1, 2018 at 5:56 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > 
> > > On 06/09/2018 18:06, Jim Mattson wrote:
> > > > 
> > > > On Thu, Sep 6, 2018 at 6:32 AM, Liran Alon <liran.alon@oracle.com> wrote:
> > > > > 
> > > > > If CPU use both VPID and EPT, TLB entries populated by CPU are tagged
> > > > > with both EPTP and VPID. Therefore, if L1 uses EPT, L2 TLB entries
> > > > > are separated from L1 TLB entries by the EPTP tags as vmcs02 use
> > > > > EPTP02 while vmcs01 use EPTP01.
> > > > > 
> > > > > Thus, we don't need to make sure that vmcs02->vpid != vmcs01->vpid.
> > > > > Therefore, we can just set vmcs02->vpid to vmcs12->vpid.
> > > > > 
> > > > > Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
> > > > > Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
> > > > > Reviewed-by: Nikita Leshchenko <nikita.leshchenko@oracle.com>
> > > > > Signed-off-by: Liran Alon <liran.alon@oracle.com>
> > > > I suggested this back in July, but Paolo didn't like it. I still like it. :-)
> > > > 
> > > > Reviewed-by: Jim Mattson <jmattson@google.com>
> > > The problem with this is still the same as in July, namely that if all
> > > guests (at any level) share the VPID space, then L1 can force an
> > > invalidation of any VPID (and thus slow down execution of other guests,
> > > including siblings of L1) through INVVPID.
> Funny fact: This discussion have lead me to look at handle_invvpid() which L0 use to
> emulate INVVPID for L1 and it seems to me to contain a serious issue.
> All calls to __vmx_flush_tlb() is called with invalidate_gpa set to true which actually results
> in L0 executing INVEPT(eptp01) instead of INVVPID(vmx->nested.vpid02).
> (Bug was introduced by c2ba05ccfde2 ("KVM: X86: introduce invalidate_gpa argument to tlb flush”)).
> I will send a very simple patch to fix this issue :)

record_steal_time() calls __vmx_flush_tlb() with invalidate_gpa=false via
kvm_vcpu_flush_tlb().  And if you look closely at the diff, you can see
that prior to invalidate_gpa we unconditionally did INVEPT if enable_ept.
This was also discussed in depth in the context of a VPID series[1].
IIRC, the consensus at thay time was that while the VPID code is horribly
buggy, none of the bugs actually cause problems because we always flush
on a nested transition.

[1] https://www.spinics.net/lists/kvm/msg169752.html

> In addition, it seems that the patch I have submitted here also lacks changing handle_invvpid()
> to execute INVVPID(vmcs12->vpid) instead of INVVPID(vmx->nested.vpid02) in case vmcs12 use EPT.
> I can fix this in v2 of this patch if we will decide it should be applied… (See below)
> 
> I understand the concern raised here as even a non-malicious L1 guest will likely hurt other guests
> performance when executing a non-individual-address type INVVPID.
> What we really wish to have here is a new Intel VMX instruction which allows invalidating TLB entries
> by given EPTP *and* VPID. Too bad there isn’t such an instruction…
> 
> > 
> > 
> > Can't L1 already do this by filling the TLB with its own entries,
> > thereby evicting its siblings' entries?
> Yes, but I agree that with this change, it is much easier for a L1 to just invalidate combined mappings used by other guests.
> So I tend to agree with Paolo that maybe this patch should be dropped…
> 
> -Liran
> 
> 
>
Liran Alon Oct. 8, 2018, 1:32 p.m. UTC | #10
> On 8 Oct 2018, at 16:24, Sean Christopherson <sean.j.christopherson@intel.com> wrote:
> 
> On Sun, 2018-10-07 at 23:14 +0300, Liran Alon wrote:
>> 
>>> 
>>> On 1 Oct 2018, at 19:49, Jim Mattson <jmattson@google.com> wrote:
>>> 
>>> On Mon, Oct 1, 2018 at 5:56 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>> 
>>>> On 06/09/2018 18:06, Jim Mattson wrote:
>>>>> 
>>>>> On Thu, Sep 6, 2018 at 6:32 AM, Liran Alon <liran.alon@oracle.com> wrote:
>>>>>> 
>>>>>> If CPU use both VPID and EPT, TLB entries populated by CPU are tagged
>>>>>> with both EPTP and VPID. Therefore, if L1 uses EPT, L2 TLB entries
>>>>>> are separated from L1 TLB entries by the EPTP tags as vmcs02 use
>>>>>> EPTP02 while vmcs01 use EPTP01.
>>>>>> 
>>>>>> Thus, we don't need to make sure that vmcs02->vpid != vmcs01->vpid.
>>>>>> Therefore, we can just set vmcs02->vpid to vmcs12->vpid.
>>>>>> 
>>>>>> Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
>>>>>> Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
>>>>>> Reviewed-by: Nikita Leshchenko <nikita.leshchenko@oracle.com>
>>>>>> Signed-off-by: Liran Alon <liran.alon@oracle.com>
>>>>> I suggested this back in July, but Paolo didn't like it. I still like it. :-)
>>>>> 
>>>>> Reviewed-by: Jim Mattson <jmattson@google.com>
>>>> The problem with this is still the same as in July, namely that if all
>>>> guests (at any level) share the VPID space, then L1 can force an
>>>> invalidation of any VPID (and thus slow down execution of other guests,
>>>> including siblings of L1) through INVVPID.
>> Funny fact: This discussion have lead me to look at handle_invvpid() which L0 use to
>> emulate INVVPID for L1 and it seems to me to contain a serious issue.
>> All calls to __vmx_flush_tlb() is called with invalidate_gpa set to true which actually results
>> in L0 executing INVEPT(eptp01) instead of INVVPID(vmx->nested.vpid02).
>> (Bug was introduced by c2ba05ccfde2 ("KVM: X86: introduce invalidate_gpa argument to tlb flush”)).
>> I will send a very simple patch to fix this issue :)
> 
> record_steal_time() calls __vmx_flush_tlb() with invalidate_gpa=false via
> kvm_vcpu_flush_tlb().  And if you look closely at the diff, you can see
> that prior to invalidate_gpa we unconditionally did INVEPT if enable_ept.
> This was also discussed in depth in the context of a VPID series[1].
> IIRC, the consensus at thay time was that while the VPID code is horribly
> buggy, none of the bugs actually cause problems because we always flush
> on a nested transition.

Yes I am aware of that.

I currently already have a simple series which I believe fixes the remaining issues in VPID code.
Only waiting for a bit more testing and reviews. Will send shortly.

It is true that probably none of these bugs manifested because most L1 hypervisors use
INVEPT instead of INVVPID when they enable EPT. The only case which this is not true
is for TLB shootdown hypercall, which in this case L1 hypervisor may use INVVPID.

> 
> [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__www.spinics.net_lists_kvm_msg169752.html&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=Jk6Q8nNzkQ6LJ6g42qARkg6ryIDGQr-yKXPNGZbpTx0&m=ahHvfxys3ePYk78-Y2po88fibK8IsEgjLvC-5mRcOb8&s=t94xDNDrS8RKCYG83SJWc3xOcMR4M-RQ_whTU73VI04&e=
> 
>> In addition, it seems that the patch I have submitted here also lacks changing handle_invvpid()
>> to execute INVVPID(vmcs12->vpid) instead of INVVPID(vmx->nested.vpid02) in case vmcs12 use EPT.
>> I can fix this in v2 of this patch if we will decide it should be applied… (See below)
>> 
>> I understand the concern raised here as even a non-malicious L1 guest will likely hurt other guests
>> performance when executing a non-individual-address type INVVPID.
>> What we really wish to have here is a new Intel VMX instruction which allows invalidating TLB entries
>> by given EPTP *and* VPID. Too bad there isn’t such an instruction…
>> 
>>> 
>>> 
>>> Can't L1 already do this by filling the TLB with its own entries,
>>> thereby evicting its siblings' entries?
>> Yes, but I agree that with this change, it is much easier for a L1 to just invalidate combined mappings used by other guests.
>> So I tend to agree with Paolo that maybe this patch should be dropped…
>> 
>> -Liran
>> 
>> 
>>
Jim Mattson Oct. 8, 2018, 4:38 p.m. UTC | #11
On Sun, Oct 7, 2018 at 1:14 PM, Liran Alon <liran.alon@oracle.com> wrote:

> I understand the concern raised here as even a non-malicious L1 guest will likely hurt other guests
> performance when executing a non-individual-address type INVVPID.

You are assuming that L1 guests are CPU-overcommitted, so that
multiple L1 guests are time-sharing a logical CPU, and that the TLB
footprints of these L1 guests are small enough to share the TLBs.  Is
this really a common scenario, particularly when one or more of the L1
guests are large enough to be running nested guests?

Even a non-malicious L1 guest may quickly evict all of the TLB entries
for other L1 guests just through normal operation. The TLBs are not so
big that there's typically a lot of unused capacity just lying around.
diff mbox series

Patch

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index b97e0c5ccb46..98faba65c24a 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -12018,10 +12018,20 @@  static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 		vmcs_write64(GUEST_BNDCFGS, vmcs12->guest_bndcfgs);
 
 	if (enable_vpid) {
-		if (nested_cpu_has_vpid(vmcs12) && vmx->nested.vpid02)
-			vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->nested.vpid02);
-		else
-			vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->vpid);
+		u16 vmcs02_vpid;
+
+		if (nested_cpu_has_vpid(vmcs12)) {
+			if (nested_cpu_has_ept(vmcs12))
+				vmcs02_vpid = vmcs12->virtual_processor_id;
+			else if (vmx->nested.vpid02)
+				vmcs02_vpid = vmx->nested.vpid02;
+			else
+				vmcs02_vpid = vmx->vpid;
+		} else {
+			vmcs02_vpid = vmx->vpid;
+		}
+
+		vmcs_write16(VIRTUAL_PROCESSOR_ID, vmcs02_vpid);
 	}
 
 	/*