mbox series

[GIT,PULL] KVM changes for Linux 5.6-rc4

Message ID 1582570669-45822-1-git-send-email-pbonzini@redhat.com (mailing list archive)
State New, archived
Headers show
Series [GIT,PULL] KVM changes for Linux 5.6-rc4 | expand

Pull-request

https://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-linus

Message

Paolo Bonzini Feb. 24, 2020, 6:57 p.m. UTC
Linus,

The following changes since commit 120881b9e888689cbdb90a1dd1689684d8bc95f3:

  docs: virt: guest-halt-polling.txt convert to ReST (2020-02-12 20:10:08 +0100)

are available in the git repository at:

  https://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-linus

for you to fetch changes up to a93236fcbe1d0248461b29c0f87cb0b510c94e6f:

  KVM: s390: rstify new ioctls in api.rst (2020-02-24 19:28:40 +0100)

----------------------------------------------------------------
Bugfixes, including the fix for CVE-2020-2732 and a few
issues found by "make W=1".

----------------------------------------------------------------
Christian Borntraeger (1):
      KVM: s390: rstify new ioctls in api.rst

Li RongQing (1):
      KVM: fix error handling in svm_hardware_setup

Miaohe Lin (4):
      KVM: nVMX: Fix some obsolete comments and grammar error
      KVM: x86: don't notify userspace IOAPIC on edge-triggered interrupt EOI
      KVM: apic: avoid calculating pending eoi from an uninitialized val
      KVM: SVM: Fix potential memory leak in svm_cpu_init()

Oliver Upton (3):
      KVM: nVMX: Emulate MTF when performing instruction emulation
      KVM: nVMX: Refactor IO bitmap checks into helper function
      KVM: nVMX: Check IO instruction VM-exit conditions

Paolo Bonzini (4):
      KVM: x86: enable -Werror
      KVM: x86: fix missing prototypes
      KVM: x86: fix incorrect comparison in trace event
      KVM: nVMX: Don't emulate instructions in guest mode

Qian Cai (1):
      kvm/emulate: fix a -Werror=cast-function-type

Suravee Suthikulpanit (1):
      kvm: x86: svm: Fix NULL pointer dereference when AVIC not enabled

Vitaly Kuznetsov (2):
      KVM: nVMX: handle nested posted interrupts when apicv is disabled for L1
      KVM: nVMX: clear PIN_BASED_POSTED_INTR from nested pinbased_ctls only when apicv is globally disabled

Xiaoyao Li (1):
      KVM: VMX: Add VMX_FEATURE_USR_WAIT_PAUSE

wanpeng li (1):
      KVM: nVMX: Hold KVM's srcu lock when syncing vmcs12->shadow

 Documentation/virt/kvm/api.rst     |  33 +++++-----
 arch/x86/include/asm/kvm_emulate.h |  13 +++-
 arch/x86/include/asm/kvm_host.h    |   3 +-
 arch/x86/include/asm/vmx.h         |   2 +-
 arch/x86/include/asm/vmxfeatures.h |   1 +
 arch/x86/include/uapi/asm/kvm.h    |   1 +
 arch/x86/kvm/Makefile              |   1 +
 arch/x86/kvm/emulate.c             |  36 ++++------
 arch/x86/kvm/irq_comm.c            |   2 +-
 arch/x86/kvm/lapic.c               |   9 ++-
 arch/x86/kvm/mmutrace.h            |   2 +-
 arch/x86/kvm/svm.c                 |  65 ++++++++++---------
 arch/x86/kvm/vmx/capabilities.h    |   1 +
 arch/x86/kvm/vmx/nested.c          |  89 ++++++++++++++++++-------
 arch/x86/kvm/vmx/nested.h          |  10 ++-
 arch/x86/kvm/vmx/vmx.c             | 130 +++++++++++++++++++++++++++++++------
 arch/x86/kvm/vmx/vmx.h             |   3 +
 arch/x86/kvm/x86.c                 |   2 +
 include/linux/kvm_host.h           |   2 +
 19 files changed, 284 insertions(+), 121 deletions(-)

Comments

pr-tracker-bot@kernel.org Feb. 24, 2020, 8:25 p.m. UTC | #1
The pull request you sent on Mon, 24 Feb 2020 19:57:49 +0100:

> https://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-linus

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/63623fd44972d1ed2bfb6e0fb631dfcf547fd1e7

Thank you!
Vitaly Kuznetsov March 2, 2020, 6:40 p.m. UTC | #2
Paolo Bonzini <pbonzini@redhat.com> writes:

>       KVM: nVMX: Don't emulate instructions in guest mode

I just discovered that this patch breaks Hyper-V on KVM completely;
Oliver's 86f7e90ce8 ("KVM: VMX: check descriptor table exits on
instruction emulation") doesn't fix it either. The breakage manifests
itself as

 qemu-system-x86-23579 [005] 22018.775584: kvm_exit:             reason EPT_VIOLATION rip 0xfffff802987d6169 info 181 0
 qemu-system-x86-23579 [005] 22018.775584: kvm_nested_vmexit:    rip fffff802987d6169 reason EPT_VIOLATION info1 181 info2 0 int_info 0 int_info_err 0
 qemu-system-x86-23579 [005] 22018.775585: kvm_page_fault:       address febd0000 error_code 181
 qemu-system-x86-23579 [005] 22018.775592: kvm_emulate_insn:     0:fffff802987d6169: f3 a5
 qemu-system-x86-23579 [005] 22018.775593: kvm_emulate_insn:     0:fffff802987d6169: f3 a5 FAIL
 qemu-system-x86-23579 [005] 22018.775596: kvm_inj_exception:    #UD (0x0)

We probably need to re-enable instruction emulation for something...
Paolo Bonzini March 3, 2020, 10:58 a.m. UTC | #3
On 02/03/20 19:40, Vitaly Kuznetsov wrote:
> 
>  qemu-system-x86-23579 [005] 22018.775584: kvm_exit:             reason EPT_VIOLATION rip 0xfffff802987d6169 info 181 0
>  qemu-system-x86-23579 [005] 22018.775584: kvm_nested_vmexit:    rip fffff802987d6169 reason EPT_VIOLATION info1 181 info2 0 int_info 0 int_info_err 0
>  qemu-system-x86-23579 [005] 22018.775585: kvm_page_fault:       address febd0000 error_code 181
>  qemu-system-x86-23579 [005] 22018.775592: kvm_emulate_insn:     0:fffff802987d6169: f3 a5
>  qemu-system-x86-23579 [005] 22018.775593: kvm_emulate_insn:     0:fffff802987d6169: f3 a5 FAIL
>  qemu-system-x86-23579 [005] 22018.775596: kvm_inj_exception:    #UD (0x0)
> 
> We probably need to re-enable instruction emulation for something...

This is a rep movsw instruction, it shouldn't be intercepted.  I think
we have a stale ctxt->intercept because the

        /* Fields above regs are cleared together. */

comment is not true anymore since

    commit c44b4c6ab80eef3a9c52c7b3f0c632942e6489aa
    Author: Bandan Das <bsd@redhat.com>
    Date:   Wed Apr 16 12:46:12 2014 -0400

    KVM: emulate: clean up initializations in init_decode_cache

    A lot of initializations are unnecessary as they get set to
    appropriate values before actually being used. Optimize
    placement of fields in x86_emulate_ctxt

    Signed-off-by: Bandan Das <bsd@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Paolo
Vitaly Kuznetsov March 3, 2020, 1:02 p.m. UTC | #4
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 02/03/20 19:40, Vitaly Kuznetsov wrote:
>> 
>>  qemu-system-x86-23579 [005] 22018.775584: kvm_exit:             reason EPT_VIOLATION rip 0xfffff802987d6169 info 181 0
>>  qemu-system-x86-23579 [005] 22018.775584: kvm_nested_vmexit:    rip fffff802987d6169 reason EPT_VIOLATION info1 181 info2 0 int_info 0 int_info_err 0
>>  qemu-system-x86-23579 [005] 22018.775585: kvm_page_fault:       address febd0000 error_code 181
>>  qemu-system-x86-23579 [005] 22018.775592: kvm_emulate_insn:     0:fffff802987d6169: f3 a5
>>  qemu-system-x86-23579 [005] 22018.775593: kvm_emulate_insn:     0:fffff802987d6169: f3 a5 FAIL
>>  qemu-system-x86-23579 [005] 22018.775596: kvm_inj_exception:    #UD (0x0)
>> 
>> We probably need to re-enable instruction emulation for something...
>
> This is a rep movsw instruction, it shouldn't be intercepted.  I think
> we have a stale ctxt->intercept because the
>
>         /* Fields above regs are cleared together. */
>
> comment is not true anymore since
>
>     commit c44b4c6ab80eef3a9c52c7b3f0c632942e6489aa
>     Author: Bandan Das <bsd@redhat.com>
>     Date:   Wed Apr 16 12:46:12 2014 -0400
>
>     KVM: emulate: clean up initializations in init_decode_cache
>

Right you are,

a big hammer like

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index 2a8f2bd..52c9bce 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -324,14 +324,6 @@ struct x86_emulate_ctxt {
         */
 
        /* current opcode length in bytes */
-       u8 opcode_len;
-       u8 b;
-       u8 intercept;
-       u8 op_bytes;
-       u8 ad_bytes;
-       struct operand src;
-       struct operand src2;
-       struct operand dst;
        union {
                int (*execute)(struct x86_emulate_ctxt *ctxt);
                fastop_t fop;
@@ -343,6 +335,14 @@ struct x86_emulate_ctxt {
         * or elsewhere
         */
        bool rip_relative;
+       u8 opcode_len;
+       u8 b;
+       u8 intercept;
+       u8 op_bytes;
+       u8 ad_bytes;
+       struct operand src;
+       struct operand src2;
+       struct operand dst;
        u8 rex_prefix;
        u8 lock_prefix;
        u8 rep_prefix;

seems to make the issue go away. (For those wondering why fielf
shuffling makes a difference: init_decode_cache() clears
[rip_relative, modrm) range) How did this even work before...
(I'm still looking at the code, stay tuned...)
Paolo Bonzini March 3, 2020, 1:07 p.m. UTC | #5
On 03/03/20 14:02, Vitaly Kuznetsov wrote:
> Right you are,
> 
> a big hammer like
> 
> diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
> index 2a8f2bd..52c9bce 100644
> --- a/arch/x86/include/asm/kvm_emulate.h
> +++ b/arch/x86/include/asm/kvm_emulate.h
> @@ -324,14 +324,6 @@ struct x86_emulate_ctxt {
>          */
>  
>         /* current opcode length in bytes */
> -       u8 opcode_len;
> -       u8 b;
> -       u8 intercept;
> -       u8 op_bytes;
> -       u8 ad_bytes;
> -       struct operand src;
> -       struct operand src2;
> -       struct operand dst;
>         union {
>                 int (*execute)(struct x86_emulate_ctxt *ctxt);
>                 fastop_t fop;
> @@ -343,6 +335,14 @@ struct x86_emulate_ctxt {
>          * or elsewhere
>          */
>         bool rip_relative;
> +       u8 opcode_len;
> +       u8 b;
> +       u8 intercept;
> +       u8 op_bytes;
> +       u8 ad_bytes;
> +       struct operand src;
> +       struct operand src2;
> +       struct operand dst;
>         u8 rex_prefix;
>         u8 lock_prefix;
>         u8 rep_prefix;
> 
> seems to make the issue go away. (For those wondering why fielf
> shuffling makes a difference: init_decode_cache() clears
> [rip_relative, modrm) range) How did this even work before...
> (I'm still looking at the code, stay tuned...)

On AMD, probably because all these instructions were normally trapped by L1.

Of these, however, most need not be zeroed again. op_bytes, ad_bytes,
opcode_len and b are initialized by x86_decode_insn, and dst/src/src2
also by decode_operand.  So only intercept is affected, adding
"ctxt->intercept = x86_intercept_none" should be enough.

Paolo
Vitaly Kuznetsov March 3, 2020, 1:38 p.m. UTC | #6
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 03/03/20 14:02, Vitaly Kuznetsov wrote:
>> Right you are,
>> 
>> a big hammer like
>> 
>> diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
>> index 2a8f2bd..52c9bce 100644
>> --- a/arch/x86/include/asm/kvm_emulate.h
>> +++ b/arch/x86/include/asm/kvm_emulate.h
>> @@ -324,14 +324,6 @@ struct x86_emulate_ctxt {
>>          */
>>  
>>         /* current opcode length in bytes */
>> -       u8 opcode_len;
>> -       u8 b;
>> -       u8 intercept;
>> -       u8 op_bytes;
>> -       u8 ad_bytes;
>> -       struct operand src;
>> -       struct operand src2;
>> -       struct operand dst;
>>         union {
>>                 int (*execute)(struct x86_emulate_ctxt *ctxt);
>>                 fastop_t fop;
>> @@ -343,6 +335,14 @@ struct x86_emulate_ctxt {
>>          * or elsewhere
>>          */
>>         bool rip_relative;
>> +       u8 opcode_len;
>> +       u8 b;
>> +       u8 intercept;
>> +       u8 op_bytes;
>> +       u8 ad_bytes;
>> +       struct operand src;
>> +       struct operand src2;
>> +       struct operand dst;
>>         u8 rex_prefix;
>>         u8 lock_prefix;
>>         u8 rep_prefix;
>> 
>> seems to make the issue go away. (For those wondering why fielf
>> shuffling makes a difference: init_decode_cache() clears
>> [rip_relative, modrm) range) How did this even work before...
>> (I'm still looking at the code, stay tuned...)
>
> On AMD, probably because all these instructions were normally trapped by L1.
>
> Of these, however, most need not be zeroed again. op_bytes, ad_bytes,
> opcode_len and b are initialized by x86_decode_insn, and dst/src/src2
> also by decode_operand.  So only intercept is affected, adding
> "ctxt->intercept = x86_intercept_none" should be enough.

This matches my findings, thank you! Patch[es] are coming.