From patchwork Sat Jul 23 00:51:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12926997 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E56E5CCA489 for ; Sat, 23 Jul 2022 00:51:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235934AbiGWAvr (ORCPT ); Fri, 22 Jul 2022 20:51:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57194 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235033AbiGWAvp (ORCPT ); Fri, 22 Jul 2022 20:51:45 -0400 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BA65B101F3 for ; Fri, 22 Jul 2022 17:51:44 -0700 (PDT) Received: by mail-pj1-x104a.google.com with SMTP id r6-20020a17090a2e8600b001f0768a1af1so5003472pjd.8 for ; Fri, 22 Jul 2022 17:51:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=dDbBsniYzHp/DkWFlMs8orp7Z4WhIFDdPFs5eLnWS08=; b=WyABck1LPPvjC6je72o/2E18L+yh/Fefg4gzXmU5uIknt5PWWm5Juk50jGnGUtiQnX hL+pyIBGmbbkOSj3PqffyytipWE5zL2OikSniX+VkGUBtgsE/QHMIORPqtS+ULO3Zsaa ip8JF1lShaoow/UOoWUm2xuTcdwdRR7wAlhiAa4RLE45hJ8wce19sHdZdHdllfSkLK6V LOqw13njtxrvL89XeQ7yv0bKFbXteI/wBieQBJ+L+BYTKD4Dbm0ODZNY2jIuJxf9MG75 61waSCk9Pa1sLCf7zN07XU0d1YcML5jhU8F//+FbyNdZvfCA6QT4vDmCl3tA8S74YScj PyoA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=dDbBsniYzHp/DkWFlMs8orp7Z4WhIFDdPFs5eLnWS08=; b=jToQvEF37ZknlaCFL/j5qXp2SoblLqeDxWR0wLzSv2tgIwPsuI9M65iV1B37FQxd1Q B/+nzd+708AfbE2XVdDX+F7BNQusv9pGXuFifHRWHbVjRWM3ljTtYzFri9CVZFczK2Zf MUURzlVa67uhC4NHPhKyocwQNjNxM8xE+NOSud5TvCIjKdR/oPdIWxnaXbVrNP/m61tu HVl7RwOe4oq8dQ59RBnHlize5Cb3R3DrydsI7LiCuzE6LIApJrKUaX6dqj6OVMBzz5XV u+F6xdryLKe7zMAE7h2wNMF1A0YrWgCBKg/Nx+biDWZbwQk+NgPiSmxNqDZZFfcVsPKl vtQA== X-Gm-Message-State: AJIora9cxWNhSU5/wcELqGvVkUdtJprxTXqDJEUoC/UN9urj5S6s+HvX SvfgtWScDhW5ZQHFmUhXqdNL/OK3EoU= X-Google-Smtp-Source: AGRyM1uJ5QTu58N+K9iKE/YpRT81L0TBA2fMLg0Trw/QkA46BeeOGxKp1UHrTTFXsIZm7jzXx47C/bCpyX4= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a17:90a:7784:b0:1ef:c0fe:968c with SMTP id v4-20020a17090a778400b001efc0fe968cmr19848862pjk.26.1658537504313; Fri, 22 Jul 2022 17:51:44 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:14 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-2-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 01/24] KVM: nVMX: Unconditionally purge queued/injected events on nested "exit" From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Drop pending exceptions and events queued for re-injection when leaving nested guest mode, even if the "exit" is due to VM-Fail, SMI, or forced by host userspace. Failure to purge events could result in an event belonging to L2 being injected into L1. This _should_ never happen for VM-Fail as all events should be blocked by nested_run_pending, but it's possible if KVM, not the L1 hypervisor, is the source of VM-Fail when running vmcs02. SMI is a nop (barring unknown bugs) as recognition of SMI and thus entry to SMM is blocked by pending exceptions and re-injected events. Forced exit is definitely buggy, but has likely gone unnoticed because userspace probably follows the forced exit with KVM_SET_VCPU_EVENTS (or some other ioctl() that purges the queue). Fixes: 4f350c6dbcb9 ("kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure properly") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson Reviewed-by: Jim Mattson Reviewed-by: Maxim Levitsky --- arch/x86/kvm/vmx/nested.c | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index b9425b142028..a980d9cbee60 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -4255,14 +4255,6 @@ static void prepare_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, nested_vmx_abort(vcpu, VMX_ABORT_SAVE_GUEST_MSR_FAIL); } - - /* - * Drop what we picked up for L2 via vmx_complete_interrupts. It is - * preserved above and would only end up incorrectly in L1. - */ - vcpu->arch.nmi_injected = false; - kvm_clear_exception_queue(vcpu); - kvm_clear_interrupt_queue(vcpu); } /* @@ -4602,6 +4594,17 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason, WARN_ON_ONCE(nested_early_check); } + /* + * Drop events/exceptions that were queued for re-injection to L2 + * (picked up via vmx_complete_interrupts()), as well as exceptions + * that were pending for L2. Note, this must NOT be hoisted above + * prepare_vmcs12(), events/exceptions queued for re-injection need to + * be captured in vmcs12 (see vmcs12_save_pending_event()). + */ + vcpu->arch.nmi_injected = false; + kvm_clear_exception_queue(vcpu); + kvm_clear_interrupt_queue(vcpu); + vmx_switch_vmcs(vcpu, &vmx->vmcs01); /* Update any VMCS fields that might have changed while L2 ran */ From patchwork Sat Jul 23 00:51:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12926998 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 26890C43334 for ; Sat, 23 Jul 2022 00:51:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236313AbiGWAvt (ORCPT ); Fri, 22 Jul 2022 20:51:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57228 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235880AbiGWAvr (ORCPT ); Fri, 22 Jul 2022 20:51:47 -0400 Received: from mail-pf1-x449.google.com (mail-pf1-x449.google.com [IPv6:2607:f8b0:4864:20::449]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1E07D13D7A for ; Fri, 22 Jul 2022 17:51:46 -0700 (PDT) Received: by mail-pf1-x449.google.com with SMTP id r7-20020aa79627000000b00528beaf82c3so2447685pfg.8 for ; Fri, 22 Jul 2022 17:51:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=v9YL/H3DzLIILzqro/Rlg3LJnOCX+QvAwinAtI7kHXg=; b=gn5B7l1MV1xM263PvGhMVA5vgyKZt7Z4XHAuiFrUB2VZy6YH6ra5CcrARs/Y6iHcCF +4ZbVN0FULqDMLk1L7xjkrBBOmDMOSEVMq2Tf7EZIE0a2ArROgE64FGBHcIUjxv8URpL og2unMeMb9gJjgiufyZjcpQR1HlCebKsMysIMQUZC9k/71EMFGMeYDjBs8BHze6sVh5g TmNh521eGopMMmnCILeCw9ImkzA3WbKPWL1OuTtomlW8FRQELkpJCRzxh5OQtITnDExJ doVhpmUjxBLNUoOXV9TBsWLcMFCRx6wZkWUTUyZ6e+WTZ0njZPpsm6CQMx1hQ07o0bfn Mtiw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=v9YL/H3DzLIILzqro/Rlg3LJnOCX+QvAwinAtI7kHXg=; b=jf12YWO/CniepCPUBAPlfp5SjposmRxAwlMxFILOC4K/BVQBdAmeTT8XXEaMkRhyTo SlhIv7I3KpCKu/XDHW1XHxDj6BZqKdJeoRSxioGqtX5y3IR8BQUpCeLzB2jtODa/HXOL DW9eHV/EzDTjnT2XSqCW65SVsCz8/EWWjIbBzMwOdXjlnWdO1lrHvws5UVXbOnmwWet1 SQABrBfJNpD/xs2+EpIG7IoA3aKmeXsyrJEV1X7tT8tH9vthREJcAY/+NPenwz1eRNw1 RLaeoycugiuwwQkYf9kC3T0HjYrvqP3GkGDvWcWBMuXhMPBR+IEXUYpBHS8UeoEWml1C DWVQ== X-Gm-Message-State: AJIora9yW/LFC4Ou+imiQ7lnCGowFt6vWprKSKDyZ1SPKdg/zYA9XM4h jj6zU2RoGViWN9hCM+IAo4IStFom1kQ= X-Google-Smtp-Source: AGRyM1ufY/lp5RZN+O14y0qhB6GMCXxqiD46hJ43a60IaDjAFWekqmkl9BFLXJMwvOiWlMA3Rc3YzkCdtV0= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a05:6a00:2401:b0:52b:cd67:d997 with SMTP id z1-20020a056a00240100b0052bcd67d997mr2490318pfh.70.1658537505696; Fri, 22 Jul 2022 17:51:45 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:15 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-3-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 02/24] KVM: VMX: Drop bits 31:16 when shoving exception error code into VMCS From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Deliberately truncate the exception error code when shoving it into the VMCS (VM-Entry field for vmcs01 and vmcs02, VM-Exit field for vmcs12). Intel CPUs are incapable of handling 32-bit error codes and will never generate an error code with bits 31:16, but userspace can provide an arbitrary error code via KVM_SET_VCPU_EVENTS. Failure to drop the bits on exception injection results in failed VM-Entry, as VMX disallows setting bits 31:16. Setting the bits on VM-Exit would at best confuse L1, and at worse induce a nested VM-Entry failure, e.g. if L1 decided to reinject the exception back into L2. Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson Reviewed-by: Jim Mattson Reviewed-by: Maxim Levitsky --- arch/x86/kvm/vmx/nested.c | 11 ++++++++++- arch/x86/kvm/vmx/vmx.c | 12 +++++++++++- 2 files changed, 21 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index a980d9cbee60..c6f9fe0b6b33 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -3827,7 +3827,16 @@ static void nested_vmx_inject_exception_vmexit(struct kvm_vcpu *vcpu, u32 intr_info = nr | INTR_INFO_VALID_MASK; if (vcpu->arch.exception.has_error_code) { - vmcs12->vm_exit_intr_error_code = vcpu->arch.exception.error_code; + /* + * Intel CPUs do not generate error codes with bits 31:16 set, + * and more importantly VMX disallows setting bits 31:16 in the + * injected error code for VM-Entry. Drop the bits to mimic + * hardware and avoid inducing failure on nested VM-Entry if L1 + * chooses to inject the exception back to L2. AMD CPUs _do_ + * generate "full" 32-bit error codes, so KVM allows userspace + * to inject exception error codes with bits 31:16 set. + */ + vmcs12->vm_exit_intr_error_code = (u16)vcpu->arch.exception.error_code; intr_info |= INTR_INFO_DELIVER_CODE_MASK; } diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 4fd25e1d6ec9..1c72cde600d0 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1621,7 +1621,17 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu) kvm_deliver_exception_payload(vcpu); if (has_error_code) { - vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE, error_code); + /* + * Despite the error code being architecturally defined as 32 + * bits, and the VMCS field being 32 bits, Intel CPUs and thus + * VMX don't actually supporting setting bits 31:16. Hardware + * will (should) never provide a bogus error code, but AMD CPUs + * do generate error codes with bits 31:16 set, and so KVM's + * ABI lets userspace shove in arbitrary 32-bit values. Drop + * the upper bits to avoid VM-Fail, losing information that + * does't really exist is preferable to killing the VM. + */ + vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE, (u16)error_code); intr_info |= INTR_INFO_DELIVER_CODE_MASK; } From patchwork Sat Jul 23 00:51:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12926999 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E2ED0C433EF for ; Sat, 23 Jul 2022 00:51:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236454AbiGWAv5 (ORCPT ); Fri, 22 Jul 2022 20:51:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57250 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236160AbiGWAvs (ORCPT ); Fri, 22 Jul 2022 20:51:48 -0400 Received: from mail-pj1-x1049.google.com (mail-pj1-x1049.google.com [IPv6:2607:f8b0:4864:20::1049]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B31CC13E04 for ; Fri, 22 Jul 2022 17:51:47 -0700 (PDT) Received: by mail-pj1-x1049.google.com with SMTP id 1-20020a17090a190100b001f05565f004so2620572pjg.0 for ; Fri, 22 Jul 2022 17:51:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=cAX1yeYYe1qjAZf7Suz3oA25ZI3D0KwAESBSQk1/8Uc=; b=cOXrACnvJt1gAw6Hnq8VaDtjh0r3jmfmqe7BQWGe611UOjBM9A5zup37jEDgwM4vaL o5ZYdhY9UkfhzxJyITivmKwh3lx2nGpIihLMSl+Y9bWRIHbBShT6+Iyn6QcBh3JHH/+E afpoSxPwU8NjoH7kZoGFKflgeyNYocJHtgn/tUsNeJjxNWAluq2gG4qoYpLkXvraqcbM +IM9nfaz34Esk7x+4InPy7IqxEHeGRtuSqVGK5LBCRjKI3H24y4SOYZ9C/CxVRQnfYnC V2bG75ssa7q8DpPYLPg0+hUuoU8qYhU+55GHSflgDDcGPRcCWBbn9l04dWS6+fd02ef2 2czg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=cAX1yeYYe1qjAZf7Suz3oA25ZI3D0KwAESBSQk1/8Uc=; b=0xfiKuRlBZHQ0eUZVdMfAQdY0a70UOixBYopuT86IsbRmsDP+0wk1CwRV2QrjMpMOr 39CwlEe5Mcf2eC0pLLa+WytVMiIbQvHZCXZW2ztm4k1dnSQQnMjXsaUBqR+BfVPnPvAr Qx4tRNPU0SofLNmYG9RDjtiPUbHlzq57g7sWETMwQFX0WOwzRWWHf5LQs4u8MYM1N+21 EwE0NXdoQuAUyTMnGpSP1oInK3exfG/1NESOqwhfKrPZ+/ZdbPlANnXwm2gTPEmnFYvG 2ykCZqk8528nS98DZXdab3A06ztBzAPFVudBTj6+RK+FXghsSkbZV1ump9o3O4cmxpIR ahxg== X-Gm-Message-State: AJIora8H4V448XRYwqLtXOdrjpHMKmQ+FqF0DDVtNoZ8V17SqXG9yZo6 KZMAcuWf/k+qGGAQ43+lLFztH4VguhQ= X-Google-Smtp-Source: AGRyM1ulE71cWQYBxtp6QFCb3feddEWBkaHLlNHeg6YlqDZv1Upu2hGFboVtRurkns20hxNN61R1FG282Bg= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a17:902:ec8b:b0:16c:20d4:eb3 with SMTP id x11-20020a170902ec8b00b0016c20d40eb3mr2384465plg.40.1658537507307; Fri, 22 Jul 2022 17:51:47 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:16 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-4-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 03/24] KVM: x86: Don't check for code breakpoints when emulating on exception From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Don't check for code breakpoints during instruction emulation if the emulation was triggered by exception interception. Code breakpoints are the highest priority fault-like exception, and KVM only emulates on exceptions that are fault-like. Thus, if hardware signaled a different exception, then the vCPU is already passed the stage of checking for hardware breakpoints. This is likely a glorified nop in terms of functionality, and is more for clarification and is technically an optimization. Intel's SDM explicitly states vmcs.GUEST_RFLAGS.RF on exception interception is the same as the value that would have been saved on the stack had the exception not been intercepted, i.e. will be '1' due to all fault-like exceptions setting RF to '1'. AMD says "guest state saved ... is the processor state as of the moment the intercept triggers", but that begs the question, "when does the intercept trigger?". Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- arch/x86/kvm/x86.c | 21 ++++++++++++++++++--- 1 file changed, 18 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 5366f884e9a7..566f9512b4a3 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -8523,8 +8523,24 @@ int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu) } EXPORT_SYMBOL_GPL(kvm_skip_emulated_instruction); -static bool kvm_vcpu_check_code_breakpoint(struct kvm_vcpu *vcpu, int *r) +static bool kvm_vcpu_check_code_breakpoint(struct kvm_vcpu *vcpu, + int emulation_type, int *r) { + WARN_ON_ONCE(emulation_type & EMULTYPE_NO_DECODE); + + /* + * Do not check for code breakpoints if hardware has already done the + * checks, as inferred from the emulation type. On NO_DECODE and SKIP, + * the instruction has passed all exception checks, and all intercepted + * exceptions that trigger emulation have lower priority than code + * breakpoints, i.e. the fact that the intercepted exception occurred + * means any code breakpoints have already been serviced. + */ + if (emulation_type & (EMULTYPE_NO_DECODE | EMULTYPE_SKIP | + EMULTYPE_TRAP_UD | EMULTYPE_TRAP_UD_FORCED | + EMULTYPE_VMWARE_GP | EMULTYPE_PF)) + return false; + if (unlikely(vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP) && (vcpu->arch.guest_debug_dr7 & DR7_BP_EN_MASK)) { struct kvm_run *kvm_run = vcpu->run; @@ -8646,8 +8662,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, * are fault-like and are higher priority than any faults on * the code fetch itself. */ - if (!(emulation_type & EMULTYPE_SKIP) && - kvm_vcpu_check_code_breakpoint(vcpu, &r)) + if (kvm_vcpu_check_code_breakpoint(vcpu, emulation_type, &r)) return r; r = x86_decode_emulated_instruction(vcpu, emulation_type, From patchwork Sat Jul 23 00:51:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927000 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F2012CCA489 for ; Sat, 23 Jul 2022 00:52:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236315AbiGWAv7 (ORCPT ); Fri, 22 Jul 2022 20:51:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57540 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236371AbiGWAvz (ORCPT ); Fri, 22 Jul 2022 20:51:55 -0400 Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0E27E87C15 for ; Fri, 22 Jul 2022 17:51:49 -0700 (PDT) Received: by mail-pg1-x54a.google.com with SMTP id f128-20020a636a86000000b0041a4b07b039so2983199pgc.5 for ; Fri, 22 Jul 2022 17:51:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=oN29UI/ppYkSTKD+t54Fsh7oXFE1JTy24JfG7uJwp+g=; b=P5Snt+Szs0btQEkwNsKlQsBsZ//yOjKpu3fFtCbh9Yf3WsVD9ZHIVH6Or1UVpU+GyU nzdx2XAWNf6mdZFLZE/sII4+aedz7LAnwRsVAwNwpAV77g0yOJ8Lm6orpeatUonvUZV6 yIbkNjUab4SIyZsgAQSorUbUpcuEh3UsrDy3Cn+OudgcRlKe44g/nctkwQfgUFgF56oF 6eSA54ruO1INLcVTC0PKJmAHe3Pm1Y9yFnYbJmsrlXu78yA6kQg70N+B9Y+3q31t25ES +DD1wG9/uzeJSp3ZxrD09FmlZzx+IZavN1XL5REiMNZMvJjL8ErNL9RvE0ph6sw908py N1Bg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=oN29UI/ppYkSTKD+t54Fsh7oXFE1JTy24JfG7uJwp+g=; b=YSRb0pGXC8LBz7eJxOO+vlYOLO5HsXOP2TSeM6/L7h3vhWvQbaceARvIBZEPLwPPqu tfmabLountGv3v1ZegEBybLkzjmLhfs66bh0JCXrIDPq41e+CoNx2n92lvWAeNNwLlyo qKdAEtfnnVbcmgDe0V1mMT0r3Ftv6nlRg+PMGAhBYvI4z36yGOs7oNC7/AOJnuLNtWOo MLVRkFpn7i2E0H9JtexOuYz9PwYpdtGTHv2mLzbvXNcLT8HZ0D9XGH7i5f6bQQmV4j7j LW0Jx0PCfQtywthliPqhc/NR6sB7MWyn4tRyq3OUkbnU7c4jj9tojJ8VhHijng1dmuoO KGZQ== X-Gm-Message-State: AJIora+HcL1mq6h+tjuEFcTwh8q1eRno8nF+KGovvUpC5SdT4YWveN41 yKYjVKEHrTflOpX8qXPp1C/+AKY/7J4= X-Google-Smtp-Source: AGRyM1udBKVmz+vqWaGzV1DyeEBlePIMHNBrFBIejJQhx8uwLi9YR1FvxY1ALPVdpVOXoPzQsVWHTSkoWH4= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a17:90b:1bcb:b0:1ef:a6c3:30c6 with SMTP id oa11-20020a17090b1bcb00b001efa6c330c6mr20541959pjb.37.1658537508651; Fri, 22 Jul 2022 17:51:48 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:17 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-5-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 04/24] KVM: nVMX: Treat General Detect #DB (DR7.GD=1) as fault-like From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Exclude General Detect #DBs, which have fault-like behavior but also have a non-zero payload (DR6.BD=1), from nVMX's handling of pending debug traps. Opportunistically rewrite the comment to better document what is being checked, i.e. "has a non-zero payload" vs. "has a payload", and to call out the many caveats surrounding #DBs that KVM dodges one way or another. Cc: Oliver Upton Cc: Peter Shier Fixes: 684c0422da71 ("KVM: nVMX: Handle pending #DB when injecting INIT VM-exit") Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- arch/x86/kvm/vmx/nested.c | 36 +++++++++++++++++++++++++----------- 1 file changed, 25 insertions(+), 11 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index c6f9fe0b6b33..456449778598 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -3853,16 +3853,29 @@ static void nested_vmx_inject_exception_vmexit(struct kvm_vcpu *vcpu, } /* - * Returns true if a debug trap is pending delivery. + * Returns true if a debug trap is (likely) pending delivery. Infer the class + * of a #DB (trap-like vs. fault-like) from the exception payload (to-be-DR6). + * Using the payload is flawed because code breakpoints (fault-like) and data + * breakpoints (trap-like) set the same bits in DR6 (breakpoint detected), i.e. + * this will return false positives if a to-be-injected code breakpoint #DB is + * pending (from KVM's perspective, but not "pending" across an instruction + * boundary). ICEBP, a.k.a. INT1, is also not reflected here even though it + * too is trap-like. * - * In KVM, debug traps bear an exception payload. As such, the class of a #DB - * exception may be inferred from the presence of an exception payload. + * KVM "works" despite these flaws as ICEBP isn't currently supported by the + * emulator, Monitor Trap Flag is not marked pending on intercepted #DBs (the + * #DB has already happened), and MTF isn't marked pending on code breakpoints + * from the emulator (because such #DBs are fault-like and thus don't trigger + * actions that fire on instruction retire). */ -static inline bool vmx_pending_dbg_trap(struct kvm_vcpu *vcpu) +static inline unsigned long vmx_get_pending_dbg_trap(struct kvm_vcpu *vcpu) { - return vcpu->arch.exception.pending && - vcpu->arch.exception.nr == DB_VECTOR && - vcpu->arch.exception.payload; + if (!vcpu->arch.exception.pending || + vcpu->arch.exception.nr != DB_VECTOR) + return 0; + + /* General Detect #DBs are always fault-like. */ + return vcpu->arch.exception.payload & ~DR6_BD; } /* @@ -3874,9 +3887,10 @@ static inline bool vmx_pending_dbg_trap(struct kvm_vcpu *vcpu) */ static void nested_vmx_update_pending_dbg(struct kvm_vcpu *vcpu) { - if (vmx_pending_dbg_trap(vcpu)) - vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS, - vcpu->arch.exception.payload); + unsigned long pending_dbg = vmx_get_pending_dbg_trap(vcpu); + + if (pending_dbg) + vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS, pending_dbg); } static bool nested_vmx_preemption_timer_pending(struct kvm_vcpu *vcpu) @@ -3933,7 +3947,7 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) * while delivering the pending exception. */ - if (vcpu->arch.exception.pending && !vmx_pending_dbg_trap(vcpu)) { + if (vcpu->arch.exception.pending && !vmx_get_pending_dbg_trap(vcpu)) { if (vmx->nested.nested_run_pending) return -EBUSY; if (!nested_vmx_check_exception(vcpu, &exit_qual)) From patchwork Sat Jul 23 00:51:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927001 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 93FF5C43334 for ; Sat, 23 Jul 2022 00:52:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236660AbiGWAwA (ORCPT ); Fri, 22 Jul 2022 20:52:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57634 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236447AbiGWAv4 (ORCPT ); Fri, 22 Jul 2022 20:51:56 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1C98788767 for ; Fri, 22 Jul 2022 17:51:51 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id z6-20020a056902054600b00670e3c8b43fso2657849ybs.23 for ; Fri, 22 Jul 2022 17:51:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=f8GyotJFmeLYyYzTha23JlR3kgjrkc9nFL5Ec/loSRI=; b=Yd9z2Jplm3eV1w/a/hrPMLm3C3a+cD40y+2Sqp92HUKFQ5tDi7wcfJ0oEG3kNjTee2 dISrKizEMjEfO7imToA4mU64rfvbrl2Bd41hOdt1gbvB57hrIh+na7+ygq50w0i4RcTi ZhE9NZPn9fmZTkC2Gu0x0W3PIJIkYYKkkLTyYxMTRYPfT3abSSkQub1IZSfUNS4aVsXI csuU8+LP9AsfHYf0Rmb5fv4ddSidsfQXThB1rlciZgVtBh85gNLzDb2hgCKDftDPrDVV ia7GLx7WTs7lqYA9sLgwASYr5razZj/uyUqNwneMhjqVpTCz8KsWGSufx2gGXwvD8bNb U4OA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=f8GyotJFmeLYyYzTha23JlR3kgjrkc9nFL5Ec/loSRI=; b=c5oDeGSOzj06ULO7d7AF5SJnR/Fl2F9NSPut8+X27BWFOcOnHQabFMOGieuUEEDlpt OsHzlrivRwyJ2n8cW4/pHXtoM7IUUyiRR8rNslsnbbB4jMPWv7RupxPdfGszcer76pSd S9BbHTmtU6BvpAO5u+7iH8QuDLla2y967cdOIyVQQM4D9fmWy/YRKW/f04RTnYsJRy53 h0dI4Bgch+CDdmpbcOhf4zARaNCM3v6RXyZPMoDS+SafMESnshNa/LnWCwmghymru/Wj a+sUvGb97yze4o6sJf5RtZuYHzXB8ndPeKCMRg4vNshekKmnprSLuyLaDkEzfDg7O0ka 6pmQ== X-Gm-Message-State: AJIora9Ypv1E2pY9c3x4dghJ3onmt/0h1ViDqMORDQ+rhUDyaahzKuU2 UByopeI5xwAfc7IB2ryw+5zfRjBmAQQ= X-Google-Smtp-Source: AGRyM1sgyhp9a5lQ6bjYiJsvHVkWKbsxLvnsPELm9w4w/UWzf+gSjkaZxI09MiazwcxB/ufnKJsqbLKLAAk= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a81:a194:0:b0:31e:62c4:e710 with SMTP id y142-20020a81a194000000b0031e62c4e710mr2095659ywg.312.1658537510167; Fri, 22 Jul 2022 17:51:50 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:18 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-6-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 05/24] KVM: nVMX: Prioritize TSS T-flag #DBs over Monitor Trap Flag From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Service TSS T-flag #DBs prior to pending MTFs, as such #DBs are higher priority than MTF. KVM itself doesn't emulate TSS #DBs, and any such exceptions injected from L1 will be handled by hardware (or morphed to a fault-like exception if injection fails), but theoretically userspace could pend a TSS T-flag #DB in conjunction with a pending MTF. Note, there's no known use case this fixes, it's purely to be technically correct with respect to Intel's SDM. Cc: Oliver Upton Cc: Peter Shier Fixes: 5ef8acbdd687 ("KVM: nVMX: Emulate MTF when performing instruction emulation") Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- arch/x86/kvm/vmx/nested.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 456449778598..6b4368d96d9e 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -3939,15 +3939,17 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) } /* - * Process any exceptions that are not debug traps before MTF. + * Process exceptions that are higher priority than Monitor Trap Flag: + * fault-like exceptions, TSS T flag #DB (not emulated by KVM, but + * could theoretically come in from userspace), and ICEBP (INT1). * * Note that only a pending nested run can block a pending exception. * Otherwise an injected NMI/interrupt should either be * lost or delivered to the nested hypervisor in the IDT_VECTORING_INFO, * while delivering the pending exception. */ - - if (vcpu->arch.exception.pending && !vmx_get_pending_dbg_trap(vcpu)) { + if (vcpu->arch.exception.pending && + !(vmx_get_pending_dbg_trap(vcpu) & ~DR6_BT)) { if (vmx->nested.nested_run_pending) return -EBUSY; if (!nested_vmx_check_exception(vcpu, &exit_qual)) From patchwork Sat Jul 23 00:51:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927002 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 438B9C433EF for ; Sat, 23 Jul 2022 00:52:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236722AbiGWAwL (ORCPT ); Fri, 22 Jul 2022 20:52:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57694 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236535AbiGWAv5 (ORCPT ); Fri, 22 Jul 2022 20:51:57 -0400 Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CBF168AEEE for ; Fri, 22 Jul 2022 17:51:52 -0700 (PDT) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-31e55518830so51101687b3.23 for ; Fri, 22 Jul 2022 17:51:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=CRV49EpunAiwzO7EUQf5uuxbCrNu29jLzCse2H7RuXc=; b=c4OhMoRdJJpj8Re9fEOkPZtmzB/dBJ4KFgrGtN3LWRpkWFBvNcpeCZBSL4aAWcYCki Qb6HSQ6WlUba6RIm/3e1UASfPJC/8qojnfgyNJP9KmikUVumKKNenZr/WzD9hCcaz37+ gQzL0XjuNZh0gdojlMcl4VoDcQvE9bM99Zaxckc9m3aUQitP6RQ+veZOltTqHhZU+7zb tOZD3DgnOzvc9He0UpgCBz91rH0vkzt1WOOMHstLoRkCsfEdre8wXI13KPJJCYxYTHFD rtj2qjZSAmxccJJ3hruDF5+KTBhvT8mkXb3Ymr+DUJ2ZOonaMlnlqvtl47uZXs6boHAd +Vbw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=CRV49EpunAiwzO7EUQf5uuxbCrNu29jLzCse2H7RuXc=; b=kRME9HAjTELthFfksJdFLs8riuRow8cNEjYMXWNkNcZ1CL3YKwCax4fSHjh+P875ft Dum0lWIm+ejLRw2bCG3SMwTlt0hAMIjjxhz/NxSvxFF3QrTOVgVikIxWRH2uHfQ4dAzs aZ0Xro4mzN+Ae3Zoh7l1kknsardOFD4MszUtNlhhCq79LncDyDvz9afVRe2XjJaKHv6W 8gACfxbH5VyXgafbh2uCP8LEQbffYq//qJ62ztCqh3S+XeN1CjAh6hXLQPCnbJtfeO5Z JK+7p+zQKiTciljNgKSybQVzf2Qa3A3rrZdBDMQh2pocTF0OS0eudud3JjNXDb+zXfTa hoBA== X-Gm-Message-State: AJIora+inwtKvKEwXhWrtGiWu57Gyc2HXS0tPa5xAtzPItfX2c+MHWkc IUx1ZmVE4D5cZuiubLinpfgsx9MbTYM= X-Google-Smtp-Source: AGRyM1uHvgXaZ2c/91aauxBZFNtIhC3sObIgwcUCYoFubQr5z9HKWdAhXjdpatBXsBJXAt9u8lVwIUReu5g= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a0d:d909:0:b0:31e:6adb:95f4 with SMTP id b9-20020a0dd909000000b0031e6adb95f4mr2072072ywe.355.1658537511856; Fri, 22 Jul 2022 17:51:51 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:19 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-7-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 06/24] KVM: x86: Treat #DBs from the emulator as fault-like (code and DR7.GD=1) From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add a dedicated "exception type" for #DBs, as #DBs can be fault-like or trap-like depending the sub-type of #DB, and effectively defer the decision of what to do with the #DB to the caller. For the emulator's two calls to exception_type(), treat the #DB as fault-like, as the emulator handles only code breakpoint and general detect #DBs, both of which are fault-like. For event injection, which uses exception_type() to determine whether to set EFLAGS.RF=1 on the stack, keep the current behavior of not setting RF=1 for #DBs. Intel and AMD explicitly state RF isn't set on code #DBs, so exempting by failing the "== EXCPT_FAULT" check is correct. The only other fault-like #DB is General Detect, and despite Intel and AMD both strongly implying (through omission) that General Detect #DBs should set RF=1, hardware (multiple generations of both Intel and AMD), in fact does not. Through insider knowledge, extreme foresight, sheer dumb luck, or some combination thereof, KVM correctly handled RF for General Detect #DBs. Fixes: 38827dbd3fb8 ("KVM: x86: Do not update EFLAGS on faulting emulation") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- arch/x86/kvm/x86.c | 27 +++++++++++++++++++++++++-- 1 file changed, 25 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 566f9512b4a3..68fb6393c96f 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -529,6 +529,7 @@ static int exception_class(int vector) #define EXCPT_TRAP 1 #define EXCPT_ABORT 2 #define EXCPT_INTERRUPT 3 +#define EXCPT_DB 4 static int exception_type(int vector) { @@ -539,8 +540,14 @@ static int exception_type(int vector) mask = 1 << vector; - /* #DB is trap, as instruction watchpoints are handled elsewhere */ - if (mask & ((1 << DB_VECTOR) | (1 << BP_VECTOR) | (1 << OF_VECTOR))) + /* + * #DBs can be trap-like or fault-like, the caller must check other CPU + * state, e.g. DR6, to determine whether a #DB is a trap or fault. + */ + if (mask & (1 << DB_VECTOR)) + return EXCPT_DB; + + if (mask & ((1 << BP_VECTOR) | (1 << OF_VECTOR))) return EXCPT_TRAP; if (mask & ((1 << DF_VECTOR) | (1 << MC_VECTOR))) @@ -8791,6 +8798,12 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, unsigned long rflags = static_call(kvm_x86_get_rflags)(vcpu); toggle_interruptibility(vcpu, ctxt->interruptibility); vcpu->arch.emulate_regs_need_sync_to_vcpu = false; + + /* + * Note, EXCPT_DB is assumed to be fault-like as the emulator + * only supports code breakpoints and general detect #DB, both + * of which are fault-like. + */ if (!ctxt->have_exception || exception_type(ctxt->exception.vector) == EXCPT_TRAP) { kvm_pmu_trigger_event(vcpu, PERF_COUNT_HW_INSTRUCTIONS); @@ -9714,6 +9727,16 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit) /* try to inject new event if pending */ if (vcpu->arch.exception.pending) { + /* + * Fault-class exceptions, except #DBs, set RF=1 in the RFLAGS + * value pushed on the stack. Trap-like exception and all #DBs + * leave RF as-is (KVM follows Intel's behavior in this regard; + * AMD states that code breakpoint #DBs excplitly clear RF=0). + * + * Note, most versions of Intel's SDM and AMD's APM incorrectly + * describe the behavior of General Detect #DBs, which are + * fault-like. They do _not_ set RF, a la code breakpoints. + */ if (exception_type(vcpu->arch.exception.nr) == EXCPT_FAULT) __kvm_set_rflags(vcpu, kvm_get_rflags(vcpu) | X86_EFLAGS_RF); From patchwork Sat Jul 23 00:51:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927003 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D003FC433EF for ; Sat, 23 Jul 2022 00:52:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236819AbiGWAwT (ORCPT ); Fri, 22 Jul 2022 20:52:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57740 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236568AbiGWAv6 (ORCPT ); Fri, 22 Jul 2022 20:51:58 -0400 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6F77D8CC96 for ; Fri, 22 Jul 2022 17:51:54 -0700 (PDT) Received: by mail-pj1-x104a.google.com with SMTP id b10-20020a17090a6e0a00b001f221432098so2695897pjk.0 for ; Fri, 22 Jul 2022 17:51:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=RtVR2Tkco0tuXD7bLAFV7Fg1ldNSfzq0D9HUqql94JM=; b=JlCKvFLufslNPFBb2FTFua18M+4EkJnw4RZBxfm4Kn7ZqfIDkj+uEaVJRC1D4a9UuG Bqkv1vEvi8eJ1dmoYbX8sOEyx0atakWIVMTcSn6S0oS0ONdVBGxmtB35y8BtmLNFwjFM Dr33xD2jTN9GLFmWh6aYzXEADCXXw19/QpupXTlCSSLSBQTXRMF/TviTi01uQ9oaCOBy XctvQJWiXIIW50Mg0Ld6F6SIMU0KRH3EVKNWFsp0I2Pg/az3oMZnCT1w4NkdzcP1dy6h 3tOeZjPdmMnhi8hAGvyd2n1Sx4aHvDGOBBd8JInbNURIUXHQFa7BJEt1jV9lxFbaHCbs 1Stw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=RtVR2Tkco0tuXD7bLAFV7Fg1ldNSfzq0D9HUqql94JM=; b=XOebge3zK0L4fCK1NLkpjbdYGCKYXgJDipbJ9zcGPNMneNokTirD28txGa3t7RJDv+ JRz/GUvqt5eZwwpolB2ucThiAC6I/XOWgt3raINvhylJvRhR5vmepEjosSqRGXBUaaHa oyCe6rnzt26QUqXHBkiP9kkdI8mKot/rO7Cmi59SYJN8yH0cYTkgzKXMHyBHjTCDxVx5 L88tOoMtL9TmsSmzPKMSdaz6kXxhpTLxZBTryp2+t340mxsXUX/l+QwHIcnxaP/6qcwW wiV+zToTd+enwtjHniKYFSZyuQ3YEmZ+1EtXuqNXpNIooQ04XF+mILmqoKHS4W4xDmV/ 5QVw== X-Gm-Message-State: AJIora+1bKxYZv7jBiL+H6OCK9rgmerOS7tIOqMSDrkl0SSE3xv68Qx5 Wnh7+Ij52M9Ksp54+YWWh0c4sc/cyHU= X-Google-Smtp-Source: AGRyM1tDknBN2M+wi6EP6p98NDwZtB00KTdlSsvJ1NXCoLJgAlNlfqBfaq7Td2vZ3FATLnMSGw7V9BakbVc= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a17:90b:4c08:b0:1f2:977:c147 with SMTP id na8-20020a17090b4c0800b001f20977c147mr20329213pjb.6.1658537513521; Fri, 22 Jul 2022 17:51:53 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:20 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-8-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 07/24] KVM: x86: Use DR7_GD macro instead of open coding check in emulator From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Use DR7_GD in the emulator instead of open coding the check, and drop a comically wrong comment. Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- arch/x86/kvm/emulate.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index bd9e9c5627d0..fdeccf446b28 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -4181,8 +4181,7 @@ static int check_dr7_gd(struct x86_emulate_ctxt *ctxt) ctxt->ops->get_dr(ctxt, 7, &dr7); - /* Check if DR7.Global_Enable is set */ - return dr7 & (1 << 13); + return dr7 & DR7_GD; } static int check_dr_read(struct x86_emulate_ctxt *ctxt) From patchwork Sat Jul 23 00:51:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927004 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CD911C43334 for ; Sat, 23 Jul 2022 00:52:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236896AbiGWAwg (ORCPT ); Fri, 22 Jul 2022 20:52:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57838 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236654AbiGWAwA (ORCPT ); Fri, 22 Jul 2022 20:52:00 -0400 Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 29FB48F530 for ; Fri, 22 Jul 2022 17:51:55 -0700 (PDT) Received: by mail-pl1-x64a.google.com with SMTP id k11-20020a170902ce0b00b0016a15fe2627so3443163plg.22 for ; Fri, 22 Jul 2022 17:51:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=nzU1XzUnvE/6/rujCQjEFEDszI5LMFfOZskmJLapFUU=; b=rwyEJY+R06LW0SE9ENmhj1dnegcwD3lH+JcfwdvFbMfKniXKroCaELMEMHeVlmmTcM IiTBsKM35kKpaW/WvhZgwlXl16dbEIuvC5pZIjyIn/W4hgiEe4ey3b98yEc0c4pxCYAX yuhjccpbTM7GpzKwJyF7DV0MHoWpZ98aLFZgJTB6fLC8TTG3/kb4i1Uva86+YqXqZvIi XwBSFuNussSDnk6FKO6xZ0Eo/jr6/mi7We6/iz+6t05ti0Q3IWqYVbmZsfB0Ieto2Sbd d1MqMan3aclEQYTkylhQMcS4BqlxSG68Hj4lINKaZNi1Mjx/EN7a2SgumszcyP7GQ2yc U7JA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=nzU1XzUnvE/6/rujCQjEFEDszI5LMFfOZskmJLapFUU=; b=dcy8M6cguDACTFTK3UNjp4Ajt65cZgsgzex/aK9+2Z9qiMxoXjJPDL+jv7inVCoTWh TQpBgFNPt+6KaLfQ4NGRQ6QGnpnhkeTJ0a2xtTK+kPLJ3GyrveGYYYJX1LcTMtVW5fmj iqsbiJDYfPqVF133dcTBo4s5yf1g8KrQtxN/vwd28USZRPFk0vZGp89bVwKevozwN9YQ oc/teGcrx7RSM/wHvrZXw2J0TYlPKnFI4u1taFmPT+3oDXsilZ3gsDQV1h03VTGheo7E ZB3EviG3iwfHyAiDbK3mJYczwbbBEhcnaVGs1tb9DXZjtKffeF0YKRvzvosO/ahVzgt+ 6qwQ== X-Gm-Message-State: AJIora8pgn4i22WZW3N6mrmsVmYJMqCxBHpmXzKSzqcxDVwOKQK+JE/o YchMct0BgJ3vW+A/mouEe6VoaQxVHIo= X-Google-Smtp-Source: AGRyM1vkgClrTi4iNH0srLgiv2Bw6T6Hr6KVH+F8Lz8n6WFnaJ8ZR8ghp/OylcC2I4Ebr3p6LQjCEgDldOo= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a62:16ce:0:b0:528:c4c7:972f with SMTP id 197-20020a6216ce000000b00528c4c7972fmr2456794pfw.55.1658537515264; Fri, 22 Jul 2022 17:51:55 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:21 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-9-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 08/24] KVM: nVMX: Ignore SIPI that arrives in L2 when vCPU is not in WFS From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Fall through to handling other pending exception/events for L2 if SIPI is pending while the CPU is not in Wait-for-SIPI. KVM correctly ignores the event, but incorrectly returns immediately, e.g. a SIPI coincident with another event could lead to KVM incorrectly routing the event to L1 instead of L2. Fixes: bf0cd88ce363 ("KVM: x86: emulate wait-for-SIPI and SIPI-VMExit") Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- arch/x86/kvm/vmx/nested.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 6b4368d96d9e..46ea7740bb9e 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -3932,10 +3932,12 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) return -EBUSY; clear_bit(KVM_APIC_SIPI, &apic->pending_events); - if (vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) + if (vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) { nested_vmx_vmexit(vcpu, EXIT_REASON_SIPI_SIGNAL, 0, apic->sipi_vector & 0xFFUL); - return 0; + return 0; + } + /* Fallthrough, the SIPI is completely ignored. */ } /* From patchwork Sat Jul 23 00:51:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927006 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 33434C433EF for ; Sat, 23 Jul 2022 00:52:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236581AbiGWAwl (ORCPT ); Fri, 22 Jul 2022 20:52:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57538 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236734AbiGWAwM (ORCPT ); Fri, 22 Jul 2022 20:52:12 -0400 Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8B4AE98218 for ; Fri, 22 Jul 2022 17:51:57 -0700 (PDT) Received: by mail-pg1-x54a.google.com with SMTP id h13-20020a63e14d000000b0040df75eaa2eso3038829pgk.21 for ; Fri, 22 Jul 2022 17:51:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=64xZYpakxCJygEkjp2mVBRUvoCiAstWPQbss8n4nWQQ=; b=lD6PrwKpPBjyRkCVqSHLPud1g5OSqjgEDHaV7xZjneDM444GsAN05jigdCvSELw/Xj 0GFlfbXNINvLzye4WSMI6K1qvyMhD3x3ft9mSkDSkuSaKS1UedTyhr8SSgMAN1BYaD/S 0NK/Xo9K4QR84G3AASKMl/WUzsUGbcDrF1D1pwe36B+dXG5nsyxtDyfwz3q8Tl9qgTsv OmUgwCyoCDWJj/fgsgtHc73tUJvuGPLW903Of4Fi6VeH4ei0bK1zzZ5YiV8LyMrKz99u /RCZ2d8VwYZaj1K5IWYiw92Go0xngDmaWfJvmMotq0DZimiI0Dk+JeQgZpMKe1uKdjTZ +E4Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=64xZYpakxCJygEkjp2mVBRUvoCiAstWPQbss8n4nWQQ=; b=OWLGO9B9g+JnczAGSVzPbd2jSgGzvIIgJt+43cgs3qiRH8QWVBbDoZLSHTKU2f9/Fp +R+hbqbZoc/cdKjl0nIa4t53dddDPi2F/VYQElOMTR9cr8q3sRHVab+FzbIL2BEXGBna FGVXB9pf27pXsgGuYUp6OwIQ+Au+pIJOVKqBFZzzmo5voNOcvILc5AttU8JsswmRopWc KS7DEH6vfWIfO0/kIQXH4wlc6GCkH9ZZhl0LGXWo8KrpY+FmCoD0MGfBkVd98WlrAK+U 6kIf7fSNKPS2CB4HM6wILBhejsxMNh6z++/HJevvshPW0lWvEMRMfFfzApzxuf+zyKYh k7rQ== X-Gm-Message-State: AJIora8nMrRlnB6LbJWPLQvVgyv7PqzrUqp4GB7OkrSwOO1vu41rY6SP yAPSauBabA9QB18h5s6bzrk3ljt8rlA= X-Google-Smtp-Source: AGRyM1vVZtVjOA7RgzIXwL/6Lnd83J/KU3pLnEqZ90Yg45eGPWngP27LdiXPsg3m09SAE0osu/mQ8oq3QJg= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a17:902:b70a:b0:16c:f62c:43aa with SMTP id d10-20020a170902b70a00b0016cf62c43aamr2359950pls.8.1658537516739; Fri, 22 Jul 2022 17:51:56 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:22 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-10-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 09/24] KVM: nVMX: Unconditionally clear mtf_pending on nested VM-Exit From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Clear mtf_pending on nested VM-Exit instead of handling the clear on a case-by-case basis in vmx_check_nested_events(). The pending MTF should never survive nested VM-Exit, as it is a property of KVM's run of the current L2, i.e. should never affect the next L2 run by L1. In practice, this is likely a nop as getting to L1 with nested_run_pending is impossible, and KVM doesn't correctly handle morphing a pending exception that occurs on a prior injected exception (need for re-injected exception being the other case where MTF isn't cleared). However, KVM will hopefully soon correctly deal with a pending exception on top of an injected exception. Add a TODO to document that KVM has an inversion priority bug between SMIs and MTF (and trap-like #DBS), and that KVM also doesn't properly save/restore MTF across SMI/RSM. Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- arch/x86/kvm/vmx/nested.c | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 46ea7740bb9e..17df0c31f0b5 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -3905,16 +3905,8 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) unsigned long exit_qual; bool block_nested_events = vmx->nested.nested_run_pending || kvm_event_needs_reinjection(vcpu); - bool mtf_pending = vmx->nested.mtf_pending; struct kvm_lapic *apic = vcpu->arch.apic; - /* - * Clear the MTF state. If a higher priority VM-exit is delivered first, - * this state is discarded. - */ - if (!block_nested_events) - vmx->nested.mtf_pending = false; - if (lapic_in_kernel(vcpu) && test_bit(KVM_APIC_INIT, &apic->pending_events)) { if (block_nested_events) @@ -3923,6 +3915,9 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) clear_bit(KVM_APIC_INIT, &apic->pending_events); if (vcpu->arch.mp_state != KVM_MP_STATE_INIT_RECEIVED) nested_vmx_vmexit(vcpu, EXIT_REASON_INIT_SIGNAL, 0, 0); + + /* MTF is discarded if the vCPU is in WFS. */ + vmx->nested.mtf_pending = false; return 0; } @@ -3945,6 +3940,11 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) * fault-like exceptions, TSS T flag #DB (not emulated by KVM, but * could theoretically come in from userspace), and ICEBP (INT1). * + * TODO: SMIs have higher priority than MTF and trap-like #DBs (except + * for TSS T flag #DBs). KVM also doesn't save/restore pending MTF + * across SMI/RSM as it should; that needs to be addressed in order to + * prioritize SMI over MTF and trap-like #DBs. + * * Note that only a pending nested run can block a pending exception. * Otherwise an injected NMI/interrupt should either be * lost or delivered to the nested hypervisor in the IDT_VECTORING_INFO, @@ -3960,7 +3960,7 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) return 0; } - if (mtf_pending) { + if (vmx->nested.mtf_pending) { if (block_nested_events) return -EBUSY; nested_vmx_update_pending_dbg(vcpu); @@ -4557,6 +4557,9 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason, struct vcpu_vmx *vmx = to_vmx(vcpu); struct vmcs12 *vmcs12 = get_vmcs12(vcpu); + /* Pending MTF traps are discarded on VM-Exit. */ + vmx->nested.mtf_pending = false; + /* trying to cancel vmlaunch/vmresume is a bug */ WARN_ON_ONCE(vmx->nested.nested_run_pending); From patchwork Sat Jul 23 00:51:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927005 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5CD61C433EF for ; Sat, 23 Jul 2022 00:52:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236379AbiGWAwj (ORCPT ); Fri, 22 Jul 2022 20:52:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57646 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236739AbiGWAwM (ORCPT ); Fri, 22 Jul 2022 20:52:12 -0400 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1703389672 for ; Fri, 22 Jul 2022 17:51:59 -0700 (PDT) Received: by mail-pl1-x649.google.com with SMTP id o7-20020a170902d4c700b0016d2f7f9a6bso3431791plg.13 for ; Fri, 22 Jul 2022 17:51:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=Dh5XhG/onjnmd2PjNqHEDq5Y/Izy1DZsJYFMaDP2/WM=; b=o7+HUcUvi+SlVGBX7T1V29ZcIIQcy3ldwgFEeUnF7/UnKremL5OXo1ztYxooVx26tv dug/4l8lJSV5pyHs+iAUZU4G0jtnel1nt+vAhmjbFTVVil4+ZOFHKr4ycIJe8W8/kbZt K5ss0ld/kKmvi7d2BdxW2aQZGvXfqpfaDQ5WAeiMw6xy8Mw91B7zoTeKfkrG4gWdy8G4 d6d29agos2Z44C+9qkstFdMtc6Po/uZolJgE05Qx0ovwaEMU01BZ3teDC0ZavAoqYQYo 68vH7w+8BBEQFAYBf/c/WXZ5l0J+baNoN/F+BWyxIVG3Eb49XFRjUhwcZmrp27Kl830T GkPw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=Dh5XhG/onjnmd2PjNqHEDq5Y/Izy1DZsJYFMaDP2/WM=; b=2Ijmn2pKM5Bkr6kzjaivDYqY1oBiqeEbqRzN+GqbZYGyq538fS1zXPRgo0AocHhb2f g08pHNgOdycWJnYPf6b8FHQReSOppgkqMR6CLNcV3gXCcUy52pwMqZyo2kbHyu43Zqt7 wHgP4vQX+fQiaJaFUex4IRO+J70GXU0z1PkadYI+RRrB0xzloQktKqHAlctwvsU9+gZU gfIbCWc2iVXlvAMMoI+JIg0rcHl3sX6VmqGDCM5CMs5aBRs30ueQm62CzWqf5P2rxHFb G/DJr46CnjlhdDqOz2ojs7CaroxtNpBHZrkEs2Z/XrptOf+CfIVkprix6CnZLnhuadKY FYeg== X-Gm-Message-State: AJIora9zzRQlLIy7mBKMsx5FR22O/YXurhUoN6UPbEaCfEJfVsKqAkr2 Co63P8iI/tat+9yVoTMD0kiT7X1jRbc= X-Google-Smtp-Source: AGRyM1sZ46Ujgo1qO3MVUIDhgt+56oBD91Ennuxg6aLoZOAqeWPJDDVcviYC4X9L6A2CkLxH5cAn/Fk4UeY= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a63:1211:0:b0:419:a2dd:63ea with SMTP id h17-20020a631211000000b00419a2dd63eamr2058770pgl.416.1658537518202; Fri, 22 Jul 2022 17:51:58 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:23 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-11-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 10/24] KVM: VMX: Inject #PF on ENCLS as "emulated" #PF From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Treat #PFs that occur during emulation of ENCLS as, wait for it, emulated page faults. Practically speaking, this is a glorified nop as the exception is never of the nested flavor, and it's extremely unlikely the guest is relying on the side effect of an implicit INVLPG on the faulting address. Fixes: 70210c044b4e ("KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions") Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- arch/x86/kvm/vmx/sgx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c index aba8cebdc587..8f95c7c01433 100644 --- a/arch/x86/kvm/vmx/sgx.c +++ b/arch/x86/kvm/vmx/sgx.c @@ -129,7 +129,7 @@ static int sgx_inject_fault(struct kvm_vcpu *vcpu, gva_t gva, int trapnr) ex.address = gva; ex.error_code_valid = true; ex.nested_page_fault = false; - kvm_inject_page_fault(vcpu, &ex); + kvm_inject_emulated_page_fault(vcpu, &ex); } else { kvm_inject_gp(vcpu, 0); } From patchwork Sat Jul 23 00:51:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927007 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 972F2CCA47C for ; Sat, 23 Jul 2022 00:52:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236724AbiGWAwn (ORCPT ); Fri, 22 Jul 2022 20:52:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57750 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236559AbiGWAwP (ORCPT ); Fri, 22 Jul 2022 20:52:15 -0400 Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 741DC9B56C for ; Fri, 22 Jul 2022 17:52:00 -0700 (PDT) Received: by mail-pl1-x64a.google.com with SMTP id k11-20020a170902ce0b00b0016a15fe2627so3443236plg.22 for ; Fri, 22 Jul 2022 17:52:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=IeaBQrfkdXNfKz+ipR/TDgY2PaD5JQz8z1t6X3mNpdI=; b=OnedRgo9/k6ckD+QBBSuuhgCzNxJLoP5EtLNRdxStOWiMj5OlJrWzfs2fOl1o//lHM wutgW3D11vyctiBleHtCDNvwHK2s1L7HtHe3BqIYveMJ3efwr+yxx4dKkkxX45H/tOUG bbA4UMUok6QCu9DRYZC8JRF9negZCfrQ9LcdHHDkmOk/9UoojUqs3mcKkWYeTRyM1K3p /SrFd7AHMXfhRA+8a6XdLJ0rNJKckx1IwcgaH1vJpYwqa1AUZ2Trc4Y66pvjopN7+CTm IUMPPYSsFWfUnFc/9gC0b723CvKkszZoHwpO6nSVIEyocdjsPt+YKzKky6dFwLUR8mmr W1FA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=IeaBQrfkdXNfKz+ipR/TDgY2PaD5JQz8z1t6X3mNpdI=; b=lRzx+wCPYQqFWHYeXbP1XTcQPyR/bTck7CeGxctNulmTKyurtEL3WA0IcTcLZaLPej uS+2/9J3JGdB/8qCAswH1C5bHix4jJEnkekvzIDoza9F1OkFTNCAHPnfTG30eUadTrif 5keB56kIM4LPd5X9V4cTYDjuL7GOTQBCqVuQj+3HsHg7rONDpqp7k3MtFnkfePPYtxbc x8q1FcKZzdDGaMPv+ZHIJtaTotd/DfrZUgFN59BXXqj+o1skLa0h0GVL1WiT96HR/XTu 4iPvt7K1XlDYAPi70FbOfUwxt/5XyzsVJscQrmMyJvmU144qnqgrSHsg83GQR0mpTMsY DvIw== X-Gm-Message-State: AJIora+7oYRss7wuaXNZhdTKlu+M3pmOqI8ux201X67wZda8/ihtjhIy QIFN3Z3oAZgXZ8nuR2HDQDT52S7Qb0Q= X-Google-Smtp-Source: AGRyM1sG3XsoD2E2Jp4QnSTv1DYXjewMRUeZEiULOQdZCOLYS5+iK56m4hOZ1Jg7tUUVgRFpbMq/qU4xtMI= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a63:8a41:0:b0:41a:4abd:5c98 with SMTP id y62-20020a638a41000000b0041a4abd5c98mr2049833pgd.292.1658537519981; Fri, 22 Jul 2022 17:51:59 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:24 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-12-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 11/24] KVM: x86: Rename kvm_x86_ops.queue_exception to inject_exception From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Rename the kvm_x86_ops hook for exception injection to better reflect reality, and to align with pretty much every other related function name in KVM. No functional change intended. Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- arch/x86/include/asm/kvm-x86-ops.h | 2 +- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/svm/svm.c | 4 ++-- arch/x86/kvm/vmx/vmx.c | 4 ++-- arch/x86/kvm/x86.c | 2 +- 5 files changed, 7 insertions(+), 7 deletions(-) diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index 51f777071584..82ba4a564e58 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -67,7 +67,7 @@ KVM_X86_OP(get_interrupt_shadow) KVM_X86_OP(patch_hypercall) KVM_X86_OP(inject_irq) KVM_X86_OP(inject_nmi) -KVM_X86_OP(queue_exception) +KVM_X86_OP(inject_exception) KVM_X86_OP(cancel_injection) KVM_X86_OP(interrupt_allowed) KVM_X86_OP(nmi_allowed) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index e8281d64a431..dbb9eab979d4 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1523,7 +1523,7 @@ struct kvm_x86_ops { unsigned char *hypercall_addr); void (*inject_irq)(struct kvm_vcpu *vcpu, bool reinjected); void (*inject_nmi)(struct kvm_vcpu *vcpu); - void (*queue_exception)(struct kvm_vcpu *vcpu); + void (*inject_exception)(struct kvm_vcpu *vcpu); void (*cancel_injection)(struct kvm_vcpu *vcpu); int (*interrupt_allowed)(struct kvm_vcpu *vcpu, bool for_injection); int (*nmi_allowed)(struct kvm_vcpu *vcpu, bool for_injection); diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index aef63aae922d..e73d79ae0e45 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -454,7 +454,7 @@ static int svm_update_soft_interrupt_rip(struct kvm_vcpu *vcpu) return 0; } -static void svm_queue_exception(struct kvm_vcpu *vcpu) +static void svm_inject_exception(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); unsigned nr = vcpu->arch.exception.nr; @@ -4798,7 +4798,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { .patch_hypercall = svm_patch_hypercall, .inject_irq = svm_inject_irq, .inject_nmi = svm_inject_nmi, - .queue_exception = svm_queue_exception, + .inject_exception = svm_inject_exception, .cancel_injection = svm_cancel_injection, .interrupt_allowed = svm_interrupt_allowed, .nmi_allowed = svm_nmi_allowed, diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 1c72cde600d0..14f75e4003d3 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1610,7 +1610,7 @@ static void vmx_clear_hlt(struct kvm_vcpu *vcpu) vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE); } -static void vmx_queue_exception(struct kvm_vcpu *vcpu) +static void vmx_inject_exception(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); unsigned nr = vcpu->arch.exception.nr; @@ -7995,7 +7995,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = { .patch_hypercall = vmx_patch_hypercall, .inject_irq = vmx_inject_irq, .inject_nmi = vmx_inject_nmi, - .queue_exception = vmx_queue_exception, + .inject_exception = vmx_inject_exception, .cancel_injection = vmx_cancel_injection, .interrupt_allowed = vmx_interrupt_allowed, .nmi_allowed = vmx_nmi_allowed, diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 68fb6393c96f..a61b6cbd7194 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9672,7 +9672,7 @@ static void kvm_inject_exception(struct kvm_vcpu *vcpu) if (vcpu->arch.exception.error_code && !is_protmode(vcpu)) vcpu->arch.exception.error_code = false; - static_call(kvm_x86_queue_exception)(vcpu); + static_call(kvm_x86_inject_exception)(vcpu); } static int inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit) From patchwork Sat Jul 23 00:51:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927008 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5A8D7C43334 for ; Sat, 23 Jul 2022 00:52:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236972AbiGWAwu (ORCPT ); Fri, 22 Jul 2022 20:52:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59212 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236844AbiGWAwd (ORCPT ); Fri, 22 Jul 2022 20:52:33 -0400 Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0B1749DEF0 for ; Fri, 22 Jul 2022 17:52:02 -0700 (PDT) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-31e55e88567so50775517b3.15 for ; Fri, 22 Jul 2022 17:52:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=sCKxuTRSU0swlld2lKTaE6x+f7GPwGIZPkFWUt1GRHw=; b=gyJSHtExmzmozRW+VD4m7BaKlHijgxpLppCFUNg47x23IkJj3Zf8mkumqpx9oSRnZa TUxJxH+zsciyRxMrovvfwTfrAkS643qNfJlLwAoS9gMBt+Hjx2qrdKc+2j3Xa/zFVUWh lhE0qYAa8h0sITCaBs6j+daO0P0I56085n2bMsP2X+OrEA4iwiUw/uyHQmazEADQN4Zg CDrgNC9KbIRJEv0T4RL+rEQTxq1c+nJMjJ9e8Qa/ffpHzi5Cczc821p8dhb+sUlVANaS gBRTM+PQiirzRnen40j4DIEcOQntC4PAVRSvEm7MOLiacKrZ6q0hPDYEHtK+LeB2gQgy 2GGw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=sCKxuTRSU0swlld2lKTaE6x+f7GPwGIZPkFWUt1GRHw=; b=xjhk+V+a3qfBJ1cR/JIJ/BnqGcvxTa6CB8FVajLw0BKVPaDg7mqMTs0VkoRpiPzzei NMBKonNkpVg7dfflXh/5ODdg2AbS2YeVAYcbqKchjTc78ZEwWVliEMCTrCkVO+/1YleF mjhpwFtrDG44zUpPvDYuxoyPig42Hz/S+W1xHSju7DbxuUh634r3CwYizVtv8fwANlDz 8KRF5wzvGBcnJbgZrLGECgPATrNFTQtaN1VfRca7xl8/BYs/sm99TVyK9GY66+r+ZU2X S/dFbZUzPl+DrNJI6dI8n0eI3voQINneQHz3/CjnvqfIzqIolGB0kM9qRltWB9bOapFn bMXA== X-Gm-Message-State: AJIora+YORkHIaVEDoIq0Bq0FAaLVrcDhKVOx6XtWP8FjWNPuxLCNjgE doQXjZT3P/z5Ki7Qiezrymru5zEKP9Y= X-Google-Smtp-Source: AGRyM1uTuCaDz7MsiBtSLl6sqWTj8d0IL7mhGhSvWIZyvVlzB4ESoBgSoWzDy5dbRE7iUgioY116zefhJe8= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a5b:9c1:0:b0:66f:e294:f9a0 with SMTP id y1-20020a5b09c1000000b0066fe294f9a0mr2158913ybq.430.1658537521498; Fri, 22 Jul 2022 17:52:01 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:25 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-13-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 12/24] KVM: x86: Make kvm_queued_exception a properly named, visible struct From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Move the definition of "struct kvm_queued_exception" out of kvm_vcpu_arch in anticipation of adding a second instance in kvm_vcpu_arch to handle exceptions that occur when vectoring an injected exception and are morphed to VM-Exit instead of leading to #DF. Opportunistically take advantage of the churn to rename "nr" to "vector". No functional change intended. Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- arch/x86/include/asm/kvm_host.h | 23 +++++----- arch/x86/kvm/svm/nested.c | 47 ++++++++++--------- arch/x86/kvm/svm/svm.c | 14 +++--- arch/x86/kvm/vmx/nested.c | 42 +++++++++-------- arch/x86/kvm/vmx/vmx.c | 20 ++++----- arch/x86/kvm/x86.c | 80 ++++++++++++++++----------------- arch/x86/kvm/x86.h | 3 +- 7 files changed, 113 insertions(+), 116 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index dbb9eab979d4..0a6a05e25f24 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -639,6 +639,17 @@ struct kvm_vcpu_xen { struct timer_list poll_timer; }; +struct kvm_queued_exception { + bool pending; + bool injected; + bool has_error_code; + u8 vector; + u32 error_code; + unsigned long payload; + bool has_payload; + u8 nested_apf; +}; + struct kvm_vcpu_arch { /* * rip and regs accesses must go through @@ -737,16 +748,8 @@ struct kvm_vcpu_arch { u8 event_exit_inst_len; - struct kvm_queued_exception { - bool pending; - bool injected; - bool has_error_code; - u8 nr; - u32 error_code; - unsigned long payload; - bool has_payload; - u8 nested_apf; - } exception; + /* Exceptions to be injected to the guest. */ + struct kvm_queued_exception exception; struct kvm_queued_interrupt { bool injected; diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 76dcc8a3e849..8f991592d277 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -468,7 +468,7 @@ static void nested_save_pending_event_to_vmcb12(struct vcpu_svm *svm, unsigned int nr; if (vcpu->arch.exception.injected) { - nr = vcpu->arch.exception.nr; + nr = vcpu->arch.exception.vector; exit_int_info = nr | SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_EXEPT; if (vcpu->arch.exception.has_error_code) { @@ -1306,42 +1306,45 @@ int nested_svm_check_permissions(struct kvm_vcpu *vcpu) static bool nested_exit_on_exception(struct vcpu_svm *svm) { - unsigned int nr = svm->vcpu.arch.exception.nr; + unsigned int vector = svm->vcpu.arch.exception.vector; - return (svm->nested.ctl.intercepts[INTERCEPT_EXCEPTION] & BIT(nr)); + return (svm->nested.ctl.intercepts[INTERCEPT_EXCEPTION] & BIT(vector)); } -static void nested_svm_inject_exception_vmexit(struct vcpu_svm *svm) +static void nested_svm_inject_exception_vmexit(struct kvm_vcpu *vcpu) { - unsigned int nr = svm->vcpu.arch.exception.nr; + struct kvm_queued_exception *ex = &vcpu->arch.exception; + struct vcpu_svm *svm = to_svm(vcpu); struct vmcb *vmcb = svm->vmcb; - vmcb->control.exit_code = SVM_EXIT_EXCP_BASE + nr; + vmcb->control.exit_code = SVM_EXIT_EXCP_BASE + ex->vector; vmcb->control.exit_code_hi = 0; - if (svm->vcpu.arch.exception.has_error_code) - vmcb->control.exit_info_1 = svm->vcpu.arch.exception.error_code; + if (ex->has_error_code) + vmcb->control.exit_info_1 = ex->error_code; /* * EXITINFO2 is undefined for all exception intercepts other * than #PF. */ - if (nr == PF_VECTOR) { - if (svm->vcpu.arch.exception.nested_apf) - vmcb->control.exit_info_2 = svm->vcpu.arch.apf.nested_apf_token; - else if (svm->vcpu.arch.exception.has_payload) - vmcb->control.exit_info_2 = svm->vcpu.arch.exception.payload; + if (ex->vector == PF_VECTOR) { + if (ex->nested_apf) + vmcb->control.exit_info_2 = vcpu->arch.apf.nested_apf_token; + else if (ex->has_payload) + vmcb->control.exit_info_2 = ex->payload; else - vmcb->control.exit_info_2 = svm->vcpu.arch.cr2; - } else if (nr == DB_VECTOR) { + vmcb->control.exit_info_2 = vcpu->arch.cr2; + } else if (ex->vector == DB_VECTOR) { /* See inject_pending_event. */ - kvm_deliver_exception_payload(&svm->vcpu); - if (svm->vcpu.arch.dr7 & DR7_GD) { - svm->vcpu.arch.dr7 &= ~DR7_GD; - kvm_update_dr7(&svm->vcpu); + kvm_deliver_exception_payload(vcpu, ex); + + if (vcpu->arch.dr7 & DR7_GD) { + vcpu->arch.dr7 &= ~DR7_GD; + kvm_update_dr7(vcpu); } - } else - WARN_ON(svm->vcpu.arch.exception.has_payload); + } else { + WARN_ON(ex->has_payload); + } nested_svm_vmexit(svm); } @@ -1379,7 +1382,7 @@ static int svm_check_nested_events(struct kvm_vcpu *vcpu) return -EBUSY; if (!nested_exit_on_exception(svm)) return 0; - nested_svm_inject_exception_vmexit(svm); + nested_svm_inject_exception_vmexit(vcpu); return 0; } diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index e73d79ae0e45..74cbe177e0d1 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -456,22 +456,20 @@ static int svm_update_soft_interrupt_rip(struct kvm_vcpu *vcpu) static void svm_inject_exception(struct kvm_vcpu *vcpu) { + struct kvm_queued_exception *ex = &vcpu->arch.exception; struct vcpu_svm *svm = to_svm(vcpu); - unsigned nr = vcpu->arch.exception.nr; - bool has_error_code = vcpu->arch.exception.has_error_code; - u32 error_code = vcpu->arch.exception.error_code; - kvm_deliver_exception_payload(vcpu); + kvm_deliver_exception_payload(vcpu, ex); - if (kvm_exception_is_soft(nr) && + if (kvm_exception_is_soft(ex->vector) && svm_update_soft_interrupt_rip(vcpu)) return; - svm->vmcb->control.event_inj = nr + svm->vmcb->control.event_inj = ex->vector | SVM_EVTINJ_VALID - | (has_error_code ? SVM_EVTINJ_VALID_ERR : 0) + | (ex->has_error_code ? SVM_EVTINJ_VALID_ERR : 0) | SVM_EVTINJ_TYPE_EXEPT; - svm->vmcb->control.event_inj_err = error_code; + svm->vmcb->control.event_inj_err = ex->error_code; } static void svm_init_erratum_383(void) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 17df0c31f0b5..0f5a7aec82a2 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -446,29 +446,27 @@ static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12, */ static int nested_vmx_check_exception(struct kvm_vcpu *vcpu, unsigned long *exit_qual) { + struct kvm_queued_exception *ex = &vcpu->arch.exception; struct vmcs12 *vmcs12 = get_vmcs12(vcpu); - unsigned int nr = vcpu->arch.exception.nr; - bool has_payload = vcpu->arch.exception.has_payload; - unsigned long payload = vcpu->arch.exception.payload; - if (nr == PF_VECTOR) { - if (vcpu->arch.exception.nested_apf) { + if (ex->vector == PF_VECTOR) { + if (ex->nested_apf) { *exit_qual = vcpu->arch.apf.nested_apf_token; return 1; } - if (nested_vmx_is_page_fault_vmexit(vmcs12, - vcpu->arch.exception.error_code)) { - *exit_qual = has_payload ? payload : vcpu->arch.cr2; + if (nested_vmx_is_page_fault_vmexit(vmcs12, ex->error_code)) { + *exit_qual = ex->has_payload ? ex->payload : vcpu->arch.cr2; return 1; } - } else if (vmcs12->exception_bitmap & (1u << nr)) { - if (nr == DB_VECTOR) { - if (!has_payload) { - payload = vcpu->arch.dr6; - payload &= ~DR6_BT; - payload ^= DR6_ACTIVE_LOW; + } else if (vmcs12->exception_bitmap & (1u << ex->vector)) { + if (ex->vector == DB_VECTOR) { + if (ex->has_payload) { + *exit_qual = ex->payload; + } else { + *exit_qual = vcpu->arch.dr6; + *exit_qual &= ~DR6_BT; + *exit_qual ^= DR6_ACTIVE_LOW; } - *exit_qual = payload; } else *exit_qual = 0; return 1; @@ -3718,7 +3716,7 @@ static void vmcs12_save_pending_event(struct kvm_vcpu *vcpu, is_double_fault(exit_intr_info))) { vmcs12->idt_vectoring_info_field = 0; } else if (vcpu->arch.exception.injected) { - nr = vcpu->arch.exception.nr; + nr = vcpu->arch.exception.vector; idt_vectoring = nr | VECTORING_INFO_VALID_MASK; if (kvm_exception_is_soft(nr)) { @@ -3822,11 +3820,11 @@ static int vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu) static void nested_vmx_inject_exception_vmexit(struct kvm_vcpu *vcpu, unsigned long exit_qual) { + struct kvm_queued_exception *ex = &vcpu->arch.exception; + u32 intr_info = ex->vector | INTR_INFO_VALID_MASK; struct vmcs12 *vmcs12 = get_vmcs12(vcpu); - unsigned int nr = vcpu->arch.exception.nr; - u32 intr_info = nr | INTR_INFO_VALID_MASK; - if (vcpu->arch.exception.has_error_code) { + if (ex->has_error_code) { /* * Intel CPUs do not generate error codes with bits 31:16 set, * and more importantly VMX disallows setting bits 31:16 in the @@ -3836,11 +3834,11 @@ static void nested_vmx_inject_exception_vmexit(struct kvm_vcpu *vcpu, * generate "full" 32-bit error codes, so KVM allows userspace * to inject exception error codes with bits 31:16 set. */ - vmcs12->vm_exit_intr_error_code = (u16)vcpu->arch.exception.error_code; + vmcs12->vm_exit_intr_error_code = (u16)ex->error_code; intr_info |= INTR_INFO_DELIVER_CODE_MASK; } - if (kvm_exception_is_soft(nr)) + if (kvm_exception_is_soft(ex->vector)) intr_info |= INTR_TYPE_SOFT_EXCEPTION; else intr_info |= INTR_TYPE_HARD_EXCEPTION; @@ -3871,7 +3869,7 @@ static void nested_vmx_inject_exception_vmexit(struct kvm_vcpu *vcpu, static inline unsigned long vmx_get_pending_dbg_trap(struct kvm_vcpu *vcpu) { if (!vcpu->arch.exception.pending || - vcpu->arch.exception.nr != DB_VECTOR) + vcpu->arch.exception.vector != DB_VECTOR) return 0; /* General Detect #DBs are always fault-like. */ diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 14f75e4003d3..4cfe6646476b 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1585,7 +1585,7 @@ static void vmx_update_emulated_instruction(struct kvm_vcpu *vcpu) */ if (nested_cpu_has_mtf(vmcs12) && (!vcpu->arch.exception.pending || - vcpu->arch.exception.nr == DB_VECTOR)) + vcpu->arch.exception.vector == DB_VECTOR)) vmx->nested.mtf_pending = true; else vmx->nested.mtf_pending = false; @@ -1612,15 +1612,13 @@ static void vmx_clear_hlt(struct kvm_vcpu *vcpu) static void vmx_inject_exception(struct kvm_vcpu *vcpu) { + struct kvm_queued_exception *ex = &vcpu->arch.exception; + u32 intr_info = ex->vector | INTR_INFO_VALID_MASK; struct vcpu_vmx *vmx = to_vmx(vcpu); - unsigned nr = vcpu->arch.exception.nr; - bool has_error_code = vcpu->arch.exception.has_error_code; - u32 error_code = vcpu->arch.exception.error_code; - u32 intr_info = nr | INTR_INFO_VALID_MASK; - kvm_deliver_exception_payload(vcpu); + kvm_deliver_exception_payload(vcpu, ex); - if (has_error_code) { + if (ex->has_error_code) { /* * Despite the error code being architecturally defined as 32 * bits, and the VMCS field being 32 bits, Intel CPUs and thus @@ -1631,21 +1629,21 @@ static void vmx_inject_exception(struct kvm_vcpu *vcpu) * the upper bits to avoid VM-Fail, losing information that * does't really exist is preferable to killing the VM. */ - vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE, (u16)error_code); + vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE, (u16)ex->error_code); intr_info |= INTR_INFO_DELIVER_CODE_MASK; } if (vmx->rmode.vm86_active) { int inc_eip = 0; - if (kvm_exception_is_soft(nr)) + if (kvm_exception_is_soft(ex->vector)) inc_eip = vcpu->arch.event_exit_inst_len; - kvm_inject_realmode_interrupt(vcpu, nr, inc_eip); + kvm_inject_realmode_interrupt(vcpu, ex->vector, inc_eip); return; } WARN_ON_ONCE(vmx->emulation_required); - if (kvm_exception_is_soft(nr)) { + if (kvm_exception_is_soft(ex->vector)) { vmcs_write32(VM_ENTRY_INSTRUCTION_LEN, vmx->vcpu.arch.event_exit_inst_len); intr_info |= INTR_TYPE_SOFT_EXCEPTION; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index a61b6cbd7194..027fc518ba75 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -557,16 +557,13 @@ static int exception_type(int vector) return EXCPT_FAULT; } -void kvm_deliver_exception_payload(struct kvm_vcpu *vcpu) +void kvm_deliver_exception_payload(struct kvm_vcpu *vcpu, + struct kvm_queued_exception *ex) { - unsigned nr = vcpu->arch.exception.nr; - bool has_payload = vcpu->arch.exception.has_payload; - unsigned long payload = vcpu->arch.exception.payload; - - if (!has_payload) + if (!ex->has_payload) return; - switch (nr) { + switch (ex->vector) { case DB_VECTOR: /* * "Certain debug exceptions may clear bit 0-3. The @@ -591,8 +588,8 @@ void kvm_deliver_exception_payload(struct kvm_vcpu *vcpu) * So they need to be flipped for DR6. */ vcpu->arch.dr6 |= DR6_ACTIVE_LOW; - vcpu->arch.dr6 |= payload; - vcpu->arch.dr6 ^= payload & DR6_ACTIVE_LOW; + vcpu->arch.dr6 |= ex->payload; + vcpu->arch.dr6 ^= ex->payload & DR6_ACTIVE_LOW; /* * The #DB payload is defined as compatible with the 'pending @@ -603,12 +600,12 @@ void kvm_deliver_exception_payload(struct kvm_vcpu *vcpu) vcpu->arch.dr6 &= ~BIT(12); break; case PF_VECTOR: - vcpu->arch.cr2 = payload; + vcpu->arch.cr2 = ex->payload; break; } - vcpu->arch.exception.has_payload = false; - vcpu->arch.exception.payload = 0; + ex->has_payload = false; + ex->payload = 0; } EXPORT_SYMBOL_GPL(kvm_deliver_exception_payload); @@ -647,17 +644,18 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu, vcpu->arch.exception.injected = false; } vcpu->arch.exception.has_error_code = has_error; - vcpu->arch.exception.nr = nr; + vcpu->arch.exception.vector = nr; vcpu->arch.exception.error_code = error_code; vcpu->arch.exception.has_payload = has_payload; vcpu->arch.exception.payload = payload; if (!is_guest_mode(vcpu)) - kvm_deliver_exception_payload(vcpu); + kvm_deliver_exception_payload(vcpu, + &vcpu->arch.exception); return; } /* to check exception */ - prev_nr = vcpu->arch.exception.nr; + prev_nr = vcpu->arch.exception.vector; if (prev_nr == DF_VECTOR) { /* triple fault -> shutdown */ kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu); @@ -675,7 +673,7 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu, vcpu->arch.exception.pending = true; vcpu->arch.exception.injected = false; vcpu->arch.exception.has_error_code = true; - vcpu->arch.exception.nr = DF_VECTOR; + vcpu->arch.exception.vector = DF_VECTOR; vcpu->arch.exception.error_code = 0; vcpu->arch.exception.has_payload = false; vcpu->arch.exception.payload = 0; @@ -5006,25 +5004,24 @@ static int kvm_vcpu_ioctl_x86_set_mce(struct kvm_vcpu *vcpu, static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu, struct kvm_vcpu_events *events) { + struct kvm_queued_exception *ex = &vcpu->arch.exception; + process_nmi(vcpu); if (kvm_check_request(KVM_REQ_SMI, vcpu)) process_smi(vcpu); /* - * In guest mode, payload delivery should be deferred, - * so that the L1 hypervisor can intercept #PF before - * CR2 is modified (or intercept #DB before DR6 is - * modified under nVMX). Unless the per-VM capability, - * KVM_CAP_EXCEPTION_PAYLOAD, is set, we may not defer the delivery of - * an exception payload and handle after a KVM_GET_VCPU_EVENTS. Since we - * opportunistically defer the exception payload, deliver it if the - * capability hasn't been requested before processing a - * KVM_GET_VCPU_EVENTS. + * In guest mode, payload delivery should be deferred if the exception + * will be intercepted by L1, e.g. KVM should not modifying CR2 if L1 + * intercepts #PF, ditto for DR6 and #DBs. If the per-VM capability, + * KVM_CAP_EXCEPTION_PAYLOAD, is not set, userspace may or may not + * propagate the payload and so it cannot be safely deferred. Deliver + * the payload if the capability hasn't been requested. */ if (!vcpu->kvm->arch.exception_payload_enabled && - vcpu->arch.exception.pending && vcpu->arch.exception.has_payload) - kvm_deliver_exception_payload(vcpu); + ex->pending && ex->has_payload) + kvm_deliver_exception_payload(vcpu, ex); /* * The API doesn't provide the instruction length for software @@ -5032,26 +5029,25 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu, * isn't advanced, we should expect to encounter the exception * again. */ - if (kvm_exception_is_soft(vcpu->arch.exception.nr)) { + if (kvm_exception_is_soft(ex->vector)) { events->exception.injected = 0; events->exception.pending = 0; } else { - events->exception.injected = vcpu->arch.exception.injected; - events->exception.pending = vcpu->arch.exception.pending; + events->exception.injected = ex->injected; + events->exception.pending = ex->pending; /* * For ABI compatibility, deliberately conflate * pending and injected exceptions when * KVM_CAP_EXCEPTION_PAYLOAD isn't enabled. */ if (!vcpu->kvm->arch.exception_payload_enabled) - events->exception.injected |= - vcpu->arch.exception.pending; + events->exception.injected |= ex->pending; } - events->exception.nr = vcpu->arch.exception.nr; - events->exception.has_error_code = vcpu->arch.exception.has_error_code; - events->exception.error_code = vcpu->arch.exception.error_code; - events->exception_has_payload = vcpu->arch.exception.has_payload; - events->exception_payload = vcpu->arch.exception.payload; + events->exception.nr = ex->vector; + events->exception.has_error_code = ex->has_error_code; + events->exception.error_code = ex->error_code; + events->exception_has_payload = ex->has_payload; + events->exception_payload = ex->payload; events->interrupt.injected = vcpu->arch.interrupt.injected && !vcpu->arch.interrupt.soft; @@ -5123,7 +5119,7 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu, process_nmi(vcpu); vcpu->arch.exception.injected = events->exception.injected; vcpu->arch.exception.pending = events->exception.pending; - vcpu->arch.exception.nr = events->exception.nr; + vcpu->arch.exception.vector = events->exception.nr; vcpu->arch.exception.has_error_code = events->exception.has_error_code; vcpu->arch.exception.error_code = events->exception.error_code; vcpu->arch.exception.has_payload = events->exception_has_payload; @@ -9665,7 +9661,7 @@ int kvm_check_nested_events(struct kvm_vcpu *vcpu) static void kvm_inject_exception(struct kvm_vcpu *vcpu) { - trace_kvm_inj_exception(vcpu->arch.exception.nr, + trace_kvm_inj_exception(vcpu->arch.exception.vector, vcpu->arch.exception.has_error_code, vcpu->arch.exception.error_code, vcpu->arch.exception.injected); @@ -9737,12 +9733,12 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit) * describe the behavior of General Detect #DBs, which are * fault-like. They do _not_ set RF, a la code breakpoints. */ - if (exception_type(vcpu->arch.exception.nr) == EXCPT_FAULT) + if (exception_type(vcpu->arch.exception.vector) == EXCPT_FAULT) __kvm_set_rflags(vcpu, kvm_get_rflags(vcpu) | X86_EFLAGS_RF); - if (vcpu->arch.exception.nr == DB_VECTOR) { - kvm_deliver_exception_payload(vcpu); + if (vcpu->arch.exception.vector == DB_VECTOR) { + kvm_deliver_exception_payload(vcpu, &vcpu->arch.exception); if (vcpu->arch.dr7 & DR7_GD) { vcpu->arch.dr7 &= ~DR7_GD; kvm_update_dr7(vcpu); diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index 1926d2cb8e79..4147d27f9fbc 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -286,7 +286,8 @@ int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, int handle_ud(struct kvm_vcpu *vcpu); -void kvm_deliver_exception_payload(struct kvm_vcpu *vcpu); +void kvm_deliver_exception_payload(struct kvm_vcpu *vcpu, + struct kvm_queued_exception *ex); void kvm_vcpu_mtrr_init(struct kvm_vcpu *vcpu); u8 kvm_mtrr_get_guest_memory_type(struct kvm_vcpu *vcpu, gfn_t gfn); From patchwork Sat Jul 23 00:51:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927009 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 90E0DC43334 for ; Sat, 23 Jul 2022 00:53:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236988AbiGWAxD (ORCPT ); Fri, 22 Jul 2022 20:53:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57838 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236851AbiGWAwe (ORCPT ); Fri, 22 Jul 2022 20:52:34 -0400 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 352C59FE38 for ; Fri, 22 Jul 2022 17:52:03 -0700 (PDT) Received: by mail-pl1-x649.google.com with SMTP id e11-20020a17090301cb00b0016c3375abd3so3409253plh.3 for ; Fri, 22 Jul 2022 17:52:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=U+ChLRWvMU/fIzZxpmNafRsKDwWA1g7ApLuxjnHhUZw=; b=lnvvi6aa7qkzYWZk5ceD+06rKpRg2vHyB1FYeqp/jy3gVBz7Gg5MWxOI5pwtUJUNgF NVGCj2O4ldzkSd/aYhj2n34ZlXaI5SOOfQxeez6k4YQRWHZU2Y1gUQzZFy298fusg7uU Q6XrVS0ST21JcyhFJqqcgFugswcXL5Q1o4PaRsCigS8gEp7xEA5aClFoEC14OiKM6V+h iDRMF8ktBgdsgReZTMKf7kknJYgcAhoChtVSSYOQ3/IuzraPrIqrD6Sgv+aR3zoCogSx 8Zy5AvfslloJ228W8iFuFjqf2NtageRNDCo4Owp8cj43X3RxmT4wG/6IzopxhGaNtoZm SMeA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=U+ChLRWvMU/fIzZxpmNafRsKDwWA1g7ApLuxjnHhUZw=; b=2veuqIQ5X9QfRzNa+qUgB/xROHghpUOgCaYjEJTKIl7pfcDmbM7VNH5+p11lhGN2Xx B2FsbrqUMusX8Trq+Rjf66BpwCg70IN5Vlt3XKg9Fnz4PciNvruA6jUge2LkjnFZQY6H 4cwdO2ajidO08eOCaHJfga4XUermpvLTtzRZEsXm7lmIkRpvZAygNO43gjw1iMfxUoIi /PV9KuEDFKYccVZZ4+DsgskHNsd8FViTOqKnytgvqry7bRFPdQGkWJAS4DjQfFWrD1Gj FqlXQdVkOwdSyuCmr8gNVrR//IrtBq6JWuPP6iKFPTRpZAY/WQfhF+8UbVWGthIP3yw4 4hIQ== X-Gm-Message-State: AJIora/seCftRJSOmUyQ00EvdQdKJfqup+r+pW+jE57eWDPklMhulujK 1/Lj+YbkjsDeJXEel+vTWniabQc5/NY= X-Google-Smtp-Source: AGRyM1s771km4rIGBISQqnv/L1ZFT5Xnbynw2dphoKMl6hBPyElH4HxpmYrg266gy0UCvEi5DIuMCneei5g= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a17:903:2310:b0:16d:38b9:2c78 with SMTP id d16-20020a170903231000b0016d38b92c78mr2086921plh.122.1658537522968; Fri, 22 Jul 2022 17:52:02 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:26 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-14-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 13/24] KVM: x86: Formalize blocking of nested pending exceptions From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Capture nested_run_pending as block_pending_exceptions so that the logic of why exceptions are blocked only needs to be documented once instead of at every place that employs the logic. No functional change intended. Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- arch/x86/kvm/svm/nested.c | 26 ++++++++++++++++---------- arch/x86/kvm/vmx/nested.c | 29 ++++++++++++++++++----------- 2 files changed, 34 insertions(+), 21 deletions(-) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 8f991592d277..a6111392985c 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -1356,10 +1356,22 @@ static inline bool nested_exit_on_init(struct vcpu_svm *svm) static int svm_check_nested_events(struct kvm_vcpu *vcpu) { - struct vcpu_svm *svm = to_svm(vcpu); - bool block_nested_events = - kvm_event_needs_reinjection(vcpu) || svm->nested.nested_run_pending; struct kvm_lapic *apic = vcpu->arch.apic; + struct vcpu_svm *svm = to_svm(vcpu); + /* + * Only a pending nested run blocks a pending exception. If there is a + * previously injected event, the pending exception occurred while said + * event was being delivered and thus needs to be handled. + */ + bool block_nested_exceptions = svm->nested.nested_run_pending; + /* + * New events (not exceptions) are only recognized at instruction + * boundaries. If an event needs reinjection, then KVM is handling a + * VM-Exit that occurred _during_ instruction execution; new events are + * blocked until the instruction completes. + */ + bool block_nested_events = block_nested_exceptions || + kvm_event_needs_reinjection(vcpu); if (lapic_in_kernel(vcpu) && test_bit(KVM_APIC_INIT, &apic->pending_events)) { @@ -1372,13 +1384,7 @@ static int svm_check_nested_events(struct kvm_vcpu *vcpu) } if (vcpu->arch.exception.pending) { - /* - * Only a pending nested run can block a pending exception. - * Otherwise an injected NMI/interrupt should either be - * lost or delivered to the nested hypervisor in the EXITINTINFO - * vmcb field, while delivering the pending exception. - */ - if (svm->nested.nested_run_pending) + if (block_nested_exceptions) return -EBUSY; if (!nested_exit_on_exception(svm)) return 0; diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 0f5a7aec82a2..361c788a73d5 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -3899,11 +3899,23 @@ static bool nested_vmx_preemption_timer_pending(struct kvm_vcpu *vcpu) static int vmx_check_nested_events(struct kvm_vcpu *vcpu) { - struct vcpu_vmx *vmx = to_vmx(vcpu); - unsigned long exit_qual; - bool block_nested_events = - vmx->nested.nested_run_pending || kvm_event_needs_reinjection(vcpu); struct kvm_lapic *apic = vcpu->arch.apic; + struct vcpu_vmx *vmx = to_vmx(vcpu); + unsigned long exit_qual; + /* + * Only a pending nested run blocks a pending exception. If there is a + * previously injected event, the pending exception occurred while said + * event was being delivered and thus needs to be handled. + */ + bool block_nested_exceptions = vmx->nested.nested_run_pending; + /* + * New events (not exceptions) are only recognized at instruction + * boundaries. If an event needs reinjection, then KVM is handling a + * VM-Exit that occurred _during_ instruction execution; new events are + * blocked until the instruction completes. + */ + bool block_nested_events = block_nested_exceptions || + kvm_event_needs_reinjection(vcpu); if (lapic_in_kernel(vcpu) && test_bit(KVM_APIC_INIT, &apic->pending_events)) { @@ -3942,15 +3954,10 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) * for TSS T flag #DBs). KVM also doesn't save/restore pending MTF * across SMI/RSM as it should; that needs to be addressed in order to * prioritize SMI over MTF and trap-like #DBs. - * - * Note that only a pending nested run can block a pending exception. - * Otherwise an injected NMI/interrupt should either be - * lost or delivered to the nested hypervisor in the IDT_VECTORING_INFO, - * while delivering the pending exception. */ if (vcpu->arch.exception.pending && !(vmx_get_pending_dbg_trap(vcpu) & ~DR6_BT)) { - if (vmx->nested.nested_run_pending) + if (block_nested_exceptions) return -EBUSY; if (!nested_vmx_check_exception(vcpu, &exit_qual)) goto no_vmexit; @@ -3967,7 +3974,7 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) } if (vcpu->arch.exception.pending) { - if (vmx->nested.nested_run_pending) + if (block_nested_exceptions) return -EBUSY; if (!nested_vmx_check_exception(vcpu, &exit_qual)) goto no_vmexit; From patchwork Sat Jul 23 00:51:27 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927012 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 999AEC43334 for ; Sat, 23 Jul 2022 00:53:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237017AbiGWAxT (ORCPT ); Fri, 22 Jul 2022 20:53:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59248 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236851AbiGWAxD (ORCPT ); Fri, 22 Jul 2022 20:53:03 -0400 Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A28F7C0B62 for ; Fri, 22 Jul 2022 17:52:16 -0700 (PDT) Received: by mail-pl1-x64a.google.com with SMTP id m5-20020a170902f64500b0016d313f3ce7so3452378plg.23 for ; Fri, 22 Jul 2022 17:52:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=mcR+Y5Ms3FD3f/xG9LEGod80N5RagmS+RXoMsriitm0=; b=V+nmOzZuqsOlKd6plJHbruPMF+/9kPLRfbD63+P9Cw6xKImIo8kC5WrOvKdk5/CzLU 800B/sNoDp5+I8UWqao4K5wAG2eBghyU+hBpPFcCL+CP8/+G0nDSVhGn2WgSWQfyWylA 6NWkB2Cq5syMoNSsXoCDiSYLKnYjU4ljZ9BG0Q/i9IFRwwuoLtWTpmrq/KH1+hkYVRQe dSLxkYs1jyTkJkeYbbDqHev7TfAts5Ct+gGJEj8f2UbSuWhcO7H83ieleGjsb1xyapeT hD08Gr9paHlO9kJf+x2TdRkqJs+mjBQH+WyiQl2lfgu04ST0gAHPYCCdaMNVbTKL5BXL VcfQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=mcR+Y5Ms3FD3f/xG9LEGod80N5RagmS+RXoMsriitm0=; b=0xM0BdaBJkqbGBa55q/nwjg78/CVSOFL6iEnMRSyr4n8hYsoqbvk8tc+LOArzsYQgJ 1XDeNmy/lUFJw100zzBQZ324A4KUumNtAs36c86Kasa1PZCgN8y9qivlfk4KE1nFXy7J ppI+z7M97dKkK0ZUyIBJEt56sNRjclYZD2nY1zIqzcowMLRpt1D8awqqvUddIjAwm5BR r8+n1XMQsbySIAixVDMYLcReK3AH1jUgOmiPDozaIful9xyQuJNmh1mGZcL3QvEbGgHO HBykd+qDdLC901W9n1YOV4kAFp0zlC3INR3GI3PjvNV7a4KlW/rN1+MqOKaPwdJfwgZk kAuw== X-Gm-Message-State: AJIora9OVZODTE5X0VFMIaNSfImSHgHIZXZZZjuC+skVgI5XUepfhbTN euN5UNwWdESFwI9kRL1mB5xdYomanBg= X-Google-Smtp-Source: AGRyM1vXVwTJjicio75jvInkCj/771zF90zgZIjhhu0yPsDC5mAnUe9nUC77Xd6SoVgnt31APPWz5bKHbb4= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a17:90a:ba11:b0:1f2:38ec:3be3 with SMTP id s17-20020a17090aba1100b001f238ec3be3mr2302715pjr.177.1658537524319; Fri, 22 Jul 2022 17:52:04 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:27 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-15-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 14/24] KVM: x86: Use kvm_queue_exception_e() to queue #DF From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Queue #DF by recursing on kvm_multiple_exception() by way of kvm_queue_exception_e() instead of open coding the behavior. This will allow KVM to Just Work when a future commit moves exception interception checks (for L2 => L1) into kvm_multiple_exception(). No functional change intended. Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- arch/x86/kvm/x86.c | 21 +++++++++------------ 1 file changed, 9 insertions(+), 12 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 027fc518ba75..041149c0cf02 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -663,25 +663,22 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu, } class1 = exception_class(prev_nr); class2 = exception_class(nr); - if ((class1 == EXCPT_CONTRIBUTORY && class2 == EXCPT_CONTRIBUTORY) - || (class1 == EXCPT_PF && class2 != EXCPT_BENIGN)) { + if ((class1 == EXCPT_CONTRIBUTORY && class2 == EXCPT_CONTRIBUTORY) || + (class1 == EXCPT_PF && class2 != EXCPT_BENIGN)) { /* - * Generate double fault per SDM Table 5-5. Set - * exception.pending = true so that the double fault - * can trigger a nested vmexit. + * Synthesize #DF. Clear the previously injected or pending + * exception so as not to incorrectly trigger shutdown. */ - vcpu->arch.exception.pending = true; vcpu->arch.exception.injected = false; - vcpu->arch.exception.has_error_code = true; - vcpu->arch.exception.vector = DF_VECTOR; - vcpu->arch.exception.error_code = 0; - vcpu->arch.exception.has_payload = false; - vcpu->arch.exception.payload = 0; - } else + vcpu->arch.exception.pending = false; + + kvm_queue_exception_e(vcpu, DF_VECTOR, 0); + } else { /* replace previous exception with a new one in a hope that instruction re-execution will regenerate lost exception */ goto queue; + } } void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr) From patchwork Sat Jul 23 00:51:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927013 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D3FC3C433EF for ; Sat, 23 Jul 2022 00:53:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237041AbiGWAxW (ORCPT ); Fri, 22 Jul 2022 20:53:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57698 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236877AbiGWAxE (ORCPT ); Fri, 22 Jul 2022 20:53:04 -0400 Received: from mail-pj1-x1049.google.com (mail-pj1-x1049.google.com [IPv6:2607:f8b0:4864:20::1049]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3D0A98C5B2 for ; Fri, 22 Jul 2022 17:52:18 -0700 (PDT) Received: by mail-pj1-x1049.google.com with SMTP id i12-20020a17090a4b8c00b001f20db22239so2679714pjh.3 for ; Fri, 22 Jul 2022 17:52:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=JPv4psDn8PV+ci2PBj7FjbjAzoqVrpvE+Je8Q54BJ7I=; b=Nj676L/UeNiybqIUAekBWLwu6V5Cht3bgUW2/uzFmdxXIZ1PW+VqhNc1+8A03vqhld 0PZYVbED+wj65Nw8k2onCO77X+2oAqbiIl6tiUYFExNYc08qSiCxNlYHkaWyMccH1odq bPUGzSUDD20MnTJ+5tmZiMfrvLlSvfQJgDdjDzRhIgKROV+u/hrJIC8EYXQyp+tAAvMt Qhy2wqLc+VtQnLnFNk8fAjkHL395U8a3gOvGNMv2FVZTxAcRE53MiG8LwAaAe1C1qgXf e5ep8PVQkn7HLTuuuM6XyWUsaIoJM942xy2F6U7xGA8TclU9Fq1lwiFGxahAyx+khTo0 cXcw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=JPv4psDn8PV+ci2PBj7FjbjAzoqVrpvE+Je8Q54BJ7I=; b=Ni/B6bItqwy/yvW1Y0hPdxmxMqb3+CZEvJoFFYdnDYzCKSaBOIjQ2znf9zzeyFP3f/ upFYTpISSRnWM+6zTc0uDuDQRj4SsZWdNCuJx/p0LLbNx+c77aH4pKtgG8kep8CKAqY5 TvI3B+ROaMjyXvsSAi8zFAyABH/iwtT/6FiVBtHgz/hLPY1JT6X621adtIVyblouAGf3 /Bm1MiNHjjoCIhYHTRsfkZdR5FIbOL8Cx9scIogCh/CPlUZq8p0/dp4ebHz8BLwHQBWa Ps8vJ2tR9rJm/iMv4Qsc4QV+gyhETFGvPJJ7vvQlnyM1oYv+y3IyyKal1r8tRYrBcjWE cmfw== X-Gm-Message-State: AJIora8RE+yYOivUzI0Ck8y1ota3O+JAODlR1v7FvxIruXkctOUaR1RP pvPrNyILTWwFWSoWPOaQfw5zVVIdLNA= X-Google-Smtp-Source: AGRyM1vKFMVzZW3Rv6phrC6lOHoVI2+C/bk07wSrayJWBhZVnDUwMZtBBvKQyqd7fgf11DtRvVin1eKImfc= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a63:5d45:0:b0:419:ab8f:c022 with SMTP id o5-20020a635d45000000b00419ab8fc022mr1892501pgm.557.1658537525684; Fri, 22 Jul 2022 17:52:05 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:28 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-16-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 15/24] KVM: x86: Hoist nested event checks above event injection logic From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Perform nested event checks before re-injecting exceptions/events into L2. If a pending exception causes VM-Exit to L1, re-injecting events into vmcs02 is premature and wasted effort. Take care to ensure events that need to be re-injected are still re-injected if checking for nested events "fails", i.e. if KVM needs to force an immediate entry+exit to complete the to-be-re-injecteed event. Keep the "can_inject" logic the same for now; it too can be pushed below the nested checks, but is a slightly riskier change (see past bugs about events not being properly purged on nested VM-Exit). Add and/or modify comments to better document the various interactions. Of note is the comment regarding "blocking" previously injected NMIs and IRQs if an exception is pending. The old comment isn't wrong strictly speaking, but it failed to capture the reason why the logic even exists. Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- arch/x86/kvm/x86.c | 89 +++++++++++++++++++++++++++------------------- 1 file changed, 53 insertions(+), 36 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 041149c0cf02..046c8c2fbd8f 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9670,53 +9670,70 @@ static void kvm_inject_exception(struct kvm_vcpu *vcpu) static int inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit) { + bool can_inject = !kvm_event_needs_reinjection(vcpu); int r; - bool can_inject = true; - /* try to reinject previous events if any */ + /* + * Process nested events first, as nested VM-Exit supercedes event + * re-injection. If there's an event queued for re-injection, it will + * be saved into the appropriate vmc{b,s}12 fields on nested VM-Exit. + */ + if (is_guest_mode(vcpu)) + r = kvm_check_nested_events(vcpu); + else + r = 0; - if (vcpu->arch.exception.injected) { + /* + * Re-inject exceptions and events *especially* if immediate entry+exit + * to/from L2 is needed, as any event that has already been injected + * into L2 needs to complete its lifecycle before injecting a new event. + * + * Don't re-inject an NMI or interrupt if there is a pending exception. + * This collision arises if an exception occurred while vectoring the + * injected event, KVM intercepted said exception, and KVM ultimately + * determined the fault belongs to the guest and queues the exception + * for injection back into the guest. + * + * "Injected" interrupts can also collide with pending exceptions if + * userspace ignores the "ready for injection" flag and blindly queues + * an interrupt. In that case, prioritizing the exception is correct, + * as the exception "occurred" before the exit to userspace. Trap-like + * exceptions, e.g. most #DBs, have higher priority than interrupts. + * And while fault-like exceptions, e.g. #GP and #PF, are the lowest + * priority, they're only generated (pended) during instruction + * execution, and interrupts are recognized at instruction boundaries. + * Thus a pending fault-like exception means the fault occurred on the + * *previous* instruction and must be serviced prior to recognizing any + * new events in order to fully complete the previous instruction. + */ + if (vcpu->arch.exception.injected) kvm_inject_exception(vcpu); - can_inject = false; - } + else if (vcpu->arch.exception.pending) + ; /* see above */ + else if (vcpu->arch.nmi_injected) + static_call(kvm_x86_inject_nmi)(vcpu); + else if (vcpu->arch.interrupt.injected) + static_call(kvm_x86_inject_irq)(vcpu, true); + /* - * Do not inject an NMI or interrupt if there is a pending - * exception. Exceptions and interrupts are recognized at - * instruction boundaries, i.e. the start of an instruction. - * Trap-like exceptions, e.g. #DB, have higher priority than - * NMIs and interrupts, i.e. traps are recognized before an - * NMI/interrupt that's pending on the same instruction. - * Fault-like exceptions, e.g. #GP and #PF, are the lowest - * priority, but are only generated (pended) during instruction - * execution, i.e. a pending fault-like exception means the - * fault occurred on the *previous* instruction and must be - * serviced prior to recognizing any new events in order to - * fully complete the previous instruction. + * Exceptions that morph to VM-Exits are handled above, and pending + * exceptions on top of injected exceptions that do not VM-Exit should + * either morph to #DF or, sadly, override the injected exception. */ - else if (!vcpu->arch.exception.pending) { - if (vcpu->arch.nmi_injected) { - static_call(kvm_x86_inject_nmi)(vcpu); - can_inject = false; - } else if (vcpu->arch.interrupt.injected) { - static_call(kvm_x86_inject_irq)(vcpu, true); - can_inject = false; - } - } - WARN_ON_ONCE(vcpu->arch.exception.injected && vcpu->arch.exception.pending); /* - * Call check_nested_events() even if we reinjected a previous event - * in order for caller to determine if it should require immediate-exit - * from L2 to L1 due to pending L1 events which require exit - * from L2 to L1. + * Bail if immediate entry+exit to/from the guest is needed to complete + * nested VM-Enter or event re-injection so that a different pending + * event can be serviced (or if KVM needs to exit to userspace). + * + * Otherwise, continue processing events even if VM-Exit occurred. The + * VM-Exit will have cleared exceptions that were meant for L2, but + * there may now be events that can be injected into L1. */ - if (is_guest_mode(vcpu)) { - r = kvm_check_nested_events(vcpu); - if (r < 0) - goto out; - } + if (r < 0) + goto out; /* try to inject new event if pending */ if (vcpu->arch.exception.pending) { From patchwork Sat Jul 23 00:51:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927014 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6C04CC43334 for ; Sat, 23 Jul 2022 00:53:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237052AbiGWAxY (ORCPT ); Fri, 22 Jul 2022 20:53:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59368 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236686AbiGWAxH (ORCPT ); Fri, 22 Jul 2022 20:53:07 -0400 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C7631C198B for ; Fri, 22 Jul 2022 17:52:20 -0700 (PDT) Received: by mail-pj1-x104a.google.com with SMTP id 92-20020a17090a09e500b001d917022847so2611478pjo.1 for ; Fri, 22 Jul 2022 17:52:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=zf1eynWwF6SwWvKmGCaEsitzwxnpNnaxjN9o7uERh5g=; b=gVYbLiB34SPJY8UCNCCv3Rf2F3pQkuldFm+43P0MJ0KurE0TUCTdgwU2TCYZ/YHkXg EeqYYYNDTnEM2QfB76TshJzpMFOOUmSTazGFE1f9QKo/kVKBS/HBoI3Bez53HmNDUzwF Eb1vKH/RcDAZx6JIG+FuoiZhkpef6zY8zC1AKQ2+j7sDK8E4eeBKy8vs8QoENxVQ5zEV uqei9JvDti3GJ2+w1gr+5C0zd9MrkEBx8dl1n5QzWkhPIqP0zSCZIQfkpG+EoALGbY8h Cdi1HNvafvLUgZu7J9vbVSeT1UWm29ieff1O60DmIUFgbc2TARdjShimRFaGKOVBxEyn zEHg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=zf1eynWwF6SwWvKmGCaEsitzwxnpNnaxjN9o7uERh5g=; b=bU6u4IfggEgTmRRlrz+aI8lQQtxPMBjnJZeIQnEbVgVq3yZglDCh2V68QCeHdxJM0f SZ8Te+X2qYuUeqBsZQRG/kHnqO2Ay9Xl9fCC8YUlxjio0/vp0FE+O71zUlNyRXsbskq2 Do9gOHfKwh5W0l5pxoMX6iWANCOUbEyBZccH6ZhTQhQHZjpk3VBT04G2Bvweb3WFkRCo Y/VHMYLTG3XGys0iI1BSr8e2AQs39SKZzSc5ez+ZM955abONON+DqhfKJoWQ1k/sgm81 cq/2th7hejgjWDEnghJ7svp8VGiiDvX3qUKOFvYmVWJYN+047Dcd4NvgTckBqb4ns7P0 KpNw== X-Gm-Message-State: AJIora87ES+QP2yOXdF/OEUyGs3Dzvt/x2wn4B+Bb2sZxszE/3mURqae Sl6UDpzu3FvnvqPtDYzMo/KgQRRAhqE= X-Google-Smtp-Source: AGRyM1tJVp9yOsLQRQ8KAjsSrzkqlug1qHo5gpI86daWUDzYqWInlZsRDmq4fCXmJ4FxoVVDpPm0yOCHrKw= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a05:6a00:993:b0:52a:dd93:f02d with SMTP id u19-20020a056a00099300b0052add93f02dmr2563317pfg.12.1658537527236; Fri, 22 Jul 2022 17:52:07 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:29 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-17-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 16/24] KVM: x86: Evaluate ability to inject SMI/NMI/IRQ after potential VM-Exit From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Determine whether or not new events can be injected after checking nested events. If a VM-Exit occurred during nested event handling, any previous event that needed re-injection is gone from's KVM perspective; the event is captured in the vmc*12 VM-Exit information, but doesn't exist in terms of what needs to be done for entry to L1. Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- arch/x86/kvm/x86.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 046c8c2fbd8f..fb9edbb6ee1a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9670,7 +9670,7 @@ static void kvm_inject_exception(struct kvm_vcpu *vcpu) static int inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit) { - bool can_inject = !kvm_event_needs_reinjection(vcpu); + bool can_inject; int r; /* @@ -9735,7 +9735,13 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit) if (r < 0) goto out; - /* try to inject new event if pending */ + /* + * New events, other than exceptions, cannot be injected if KVM needs + * to re-inject a previous event. See above comments on re-injecting + * for why pending exceptions get priority. + */ + can_inject = !kvm_event_needs_reinjection(vcpu); + if (vcpu->arch.exception.pending) { /* * Fault-class exceptions, except #DBs, set RF=1 in the RFLAGS From patchwork Sat Jul 23 00:51:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927015 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C104ECCA473 for ; Sat, 23 Jul 2022 00:53:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237060AbiGWAx0 (ORCPT ); Fri, 22 Jul 2022 20:53:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57538 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237004AbiGWAxI (ORCPT ); Fri, 22 Jul 2022 20:53:08 -0400 Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 38399C1994 for ; Fri, 22 Jul 2022 17:52:21 -0700 (PDT) Received: by mail-pg1-x549.google.com with SMTP id e123-20020a636981000000b0041a3e675844so3039378pgc.23 for ; Fri, 22 Jul 2022 17:52:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=T7ZTAzFvlMNUytdgH9fHZ5hBLaLMdGvy2GiaJFqTamQ=; b=YkyM0tuSTGHGXWkmXGXxraYWNkICZoIIZ/iNxzXeej5x62mZNIzndSUh7y35R+tRQT yX+h/pIuM34hqyXRcWKNf1ZKTqoZwhg8hw6M3u9Km2yQ2Oexi+ex8qBUXWepRoBjE7Gc BiFdywACFdX8qvB7TRmqtU8tQ2+vtXYIem4s+zuDCZMhIoeG+1z44CoryhCPsIn4CkCS 0olu86UllNl4xe1DVS3ABIK1/pdAoNdqIuXk+mQ70lqaMc1pvtFuMgx7cIunQYj5qh9n BA/FmPAX+UsDTI6SAHC8qMgfxMEjTSeqb8TQTHp7toXcVDnZymLoqLwrr83h+y4FM+Uf 8pNA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=T7ZTAzFvlMNUytdgH9fHZ5hBLaLMdGvy2GiaJFqTamQ=; b=phpXLyQHib9HhT2rkOxaDjxW5zun6fKxlTFzJESSALLhtnkFRfKnf6X0z1/NzN2xok 0K+cCY6yZ9vsBZe/JZL6J2VBtOnuVSL4rctGsls20DkoSJwt7Fod47865fK9Cr3vljoc 4+LWIdparuUqhR2BXiqQeWzftYTN1wZbyDKxRiNQXrCgQ54qFq4fYqq5ZhcbCiizCOVt 9dXtLYxuFZ0WxvBYjo9WHPpREputnyPwPfvJpOQwZLcahTHcKyP0exmPEvFBrGbZAphq GUgI9Ftgn210JLB4RdQWpVpo+ezQfQwO4oO3aapQrsrwGLv84mqAY2FQZznvudzOjCYP 5FKg== X-Gm-Message-State: AJIora/lwhAue01AzvwMVqLrl/V+I+JKLe5+F2TrfiGGf6K3o9kzw69N qeuB4+BuSabLF7/Um8DsBCNWS0B/yBw= X-Google-Smtp-Source: AGRyM1sg2xe3MgIUnLos3pSzqxTvjVbzTpT5+F8l9dkitvp9xzW9036R4xVmCrDZhp+w18d8uqJgKyejya0= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a17:90a:887:b0:1f2:af6:4d20 with SMTP id v7-20020a17090a088700b001f20af64d20mr20342695pjc.190.1658537529073; Fri, 22 Jul 2022 17:52:09 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:30 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-18-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 17/24] KVM: nVMX: Add a helper to identify low-priority #DB traps From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add a helper to identify "low"-priority #DB traps, i.e. trap-like #DBs that aren't TSS T flag #DBs, and tweak the related code to operate on any queued exception. A future commit will separate exceptions that are intercepted by L1, i.e. cause nested VM-Exit, from those that do NOT trigger nested VM-Exit. I.e. there will be multiple exception structs and multiple invocations of the helpers. No functional change intended. Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky arch.exception.pending || - vcpu->arch.exception.vector != DB_VECTOR) + if (!ex->pending || ex->vector != DB_VECTOR) return 0; /* General Detect #DBs are always fault-like. */ - return vcpu->arch.exception.payload & ~DR6_BD; + return ex->payload & ~DR6_BD; +} + +/* + * Returns true if there's a pending #DB exception that is lower priority than + * a pending Monitor Trap Flag VM-Exit. TSS T-flag #DBs are not emulated by + * KVM, but could theoretically be injected by userspace. Note, this code is + * imperfect, see above. + */ +static bool vmx_is_low_priority_db_trap(struct kvm_queued_exception *ex) +{ + return vmx_get_pending_dbg_trap(ex) & ~DR6_BT; } /* @@ -3885,8 +3895,9 @@ static inline unsigned long vmx_get_pending_dbg_trap(struct kvm_vcpu *vcpu) */ static void nested_vmx_update_pending_dbg(struct kvm_vcpu *vcpu) { - unsigned long pending_dbg = vmx_get_pending_dbg_trap(vcpu); + unsigned long pending_dbg; + pending_dbg = vmx_get_pending_dbg_trap(&vcpu->arch.exception); if (pending_dbg) vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS, pending_dbg); } @@ -3956,7 +3967,7 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) * prioritize SMI over MTF and trap-like #DBs. */ if (vcpu->arch.exception.pending && - !(vmx_get_pending_dbg_trap(vcpu) & ~DR6_BT)) { + !vmx_is_low_priority_db_trap(&vcpu->arch.exception)) { if (block_nested_exceptions) return -EBUSY; if (!nested_vmx_check_exception(vcpu, &exit_qual)) From patchwork Sat Jul 23 00:51:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927010 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9DF3EC43334 for ; Sat, 23 Jul 2022 00:53:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236940AbiGWAxO (ORCPT ); Fri, 22 Jul 2022 20:53:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58574 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236943AbiGWAwj (ORCPT ); Fri, 22 Jul 2022 20:52:39 -0400 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 02F82C06E3 for ; Fri, 22 Jul 2022 17:52:10 -0700 (PDT) Received: by mail-pl1-x649.google.com with SMTP id c15-20020a170902d48f00b0016c01db365cso3453878plg.20 for ; Fri, 22 Jul 2022 17:52:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=fEpVGwQvu7cHsTd3X/MiBjAxI2NyMtEZka/ZEp33tvI=; b=bJVxa3OHd3htR9dyeudy/xCHRr9hgtMFgTYXVuLp4J+J3F2GGkyVWQezMy1/EeiBR2 1fUdG7yBcEsBakSXYe230z2ppIYhwvaxq5W+V3KIpwQIE9rmEKM1geIhPROO9p2nn/VG LhZP8xknWzZibZamSvQkxWdZ4za3ZNFJp4U68I2zOYzEhaqEzEMb9iUAgnOtSQZS++qV 7OtlxEMfQB4U7FIBPKZfwqOlpaBq1iC5SjBee0PGlGoeGhRLnXFMNm0bEx5D/kIBNkO6 UNq8tZe+ayaMbzg4pBTmMocHe/3ZXse0v2IWIK6SYFXFJTXyFlxy6t25xYPFgwVwckd1 PlGg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=fEpVGwQvu7cHsTd3X/MiBjAxI2NyMtEZka/ZEp33tvI=; b=vur4ajwMTH2tJodwgYQqOkEqsc+SotuDIW8mg7ZPGqs3dNSlIKi1UaCl1D7XhEnV0P +YtlAnQ/LKf0MX8y4yBQyGpkj0glHgbNF+dHQy5L7hgR9O9uw+zEW1HNl7+aNzvEfNZd v2B9Y9AXxZ5GmG10cUyheI6wNma7jd63OWSuIT0ZgU7MzQCp1ckIyrTn3vLAUV0rxkdu pem9C0Q4/+6uYtObV1EDnHzA8C9tvASOpk3OcRUTBk52JE8r5VKSg80xKWJX2Y1P9uGP niN2cKOOrtcONUkVQ9Uy135D3LWomRrsJ5W8H7DIvZflZusJ1ksTjA0YWtzeZHrwV8Cr +eqw== X-Gm-Message-State: AJIora/NaKdHTwug2sUCmDQbFGsYL8Mbgj5w/SS8DuWf1RGHqyPEUgK9 41PgdARvSOIM+aOUuj2ExgzKXKri6Fs= X-Google-Smtp-Source: AGRyM1tCrxMu7u3LPVftX0y+EK7JzAc5x7hnJepesw+/YWe1P06hyyDCNR979cMoFcpmvEdyVHYFLTFSskY= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a62:1b57:0:b0:52a:d646:de3c with SMTP id b84-20020a621b57000000b0052ad646de3cmr2514471pfb.60.1658537530646; Fri, 22 Jul 2022 17:52:10 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:31 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-19-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 18/24] KVM: nVMX: Document priority of all known events on Intel CPUs From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add a gigantic comment above vmx_check_nested_events() to document the priorities of all known events on Intel CPUs. Intel's SDM doesn't include VMX-specific events in its "Priority Among Concurrent Events", which makes it painfully difficult to suss out the correct priority between things like Monitor Trap Flag VM-Exits and pending #DBs. Kudos to Jim Mattson for doing the hard work of collecting and interpreting the priorities from various locations throughtout the SDM (because putting them all in one place in the SDM would be too easy). Cc: Jim Mattson Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- arch/x86/kvm/vmx/nested.c | 83 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 83 insertions(+) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 379a7b647ac7..3f6617cbf5d9 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -3908,6 +3908,89 @@ static bool nested_vmx_preemption_timer_pending(struct kvm_vcpu *vcpu) to_vmx(vcpu)->nested.preemption_timer_expired; } +/* + * Per the Intel SDM's table "Priority Among Concurrent Events", with minor + * edits to fill in missing examples, e.g. #DB due to split-lock accesses, + * and less minor edits to splice in the priority of VMX Non-Root specific + * events, e.g. MTF and NMI/INTR-window exiting. + * + * 1 Hardware Reset and Machine Checks + * - RESET + * - Machine Check + * + * 2 Trap on Task Switch + * - T flag in TSS is set (on task switch) + * + * 3 External Hardware Interventions + * - FLUSH + * - STOPCLK + * - SMI + * - INIT + * + * 3.5 Monitor Trap Flag (MTF) VM-exit[1] + * + * 4 Traps on Previous Instruction + * - Breakpoints + * - Trap-class Debug Exceptions (#DB due to TF flag set, data/I-O + * breakpoint, or #DB due to a split-lock access) + * + * 4.3 VMX-preemption timer expired VM-exit + * + * 4.6 NMI-window exiting VM-exit[2] + * + * 5 Nonmaskable Interrupts (NMI) + * + * 5.5 Interrupt-window exiting VM-exit and Virtual-interrupt delivery + * + * 6 Maskable Hardware Interrupts + * + * 7 Code Breakpoint Fault + * + * 8 Faults from Fetching Next Instruction + * - Code-Segment Limit Violation + * - Code Page Fault + * - Control protection exception (missing ENDBRANCH at target of indirect + * call or jump) + * + * 9 Faults from Decoding Next Instruction + * - Instruction length > 15 bytes + * - Invalid Opcode + * - Coprocessor Not Available + * + *10 Faults on Executing Instruction + * - Overflow + * - Bound error + * - Invalid TSS + * - Segment Not Present + * - Stack fault + * - General Protection + * - Data Page Fault + * - Alignment Check + * - x86 FPU Floating-point exception + * - SIMD floating-point exception + * - Virtualization exception + * - Control protection exception + * + * [1] Per the "Monitor Trap Flag" section: System-management interrupts (SMIs), + * INIT signals, and higher priority events take priority over MTF VM exits. + * MTF VM exits take priority over debug-trap exceptions and lower priority + * events. + * + * [2] Debug-trap exceptions and higher priority events take priority over VM exits + * caused by the VMX-preemption timer. VM exits caused by the VMX-preemption + * timer take priority over VM exits caused by the "NMI-window exiting" + * VM-execution control and lower priority events. + * + * [3] Debug-trap exceptions and higher priority events take priority over VM exits + * caused by "NMI-window exiting". VM exits caused by this control take + * priority over non-maskable interrupts (NMIs) and lower priority events. + * + * [4] Virtual-interrupt delivery has the same priority as that of VM exits due to + * the 1-setting of the "interrupt-window exiting" VM-execution control. Thus, + * non-maskable interrupts (NMIs) and higher priority events take priority over + * delivery of a virtual interrupt; delivery of a virtual interrupt takes + * priority over external interrupts and lower priority events. + */ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) { struct kvm_lapic *apic = vcpu->arch.apic; From patchwork Sat Jul 23 00:51:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927022 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2D467C43334 for ; Sat, 23 Jul 2022 00:53:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237098AbiGWAxj (ORCPT ); Fri, 22 Jul 2022 20:53:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57784 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236941AbiGWAxO (ORCPT ); Fri, 22 Jul 2022 20:53:14 -0400 Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E17E389AB4 for ; Fri, 22 Jul 2022 17:52:23 -0700 (PDT) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-31e559f6840so50446577b3.6 for ; Fri, 22 Jul 2022 17:52:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=D4NuN+fMqTWypgZwou/P67mZqvTtL5kh4ARQwWI8a1M=; b=fekN9ydNSSf4DOnS59cST4j2sl2ciCYkLM8uBn3AQQALIGMtN6Db4NzfsYdVx1TOk5 Rvempo6BPc/FYp1IygkR6zz1Dwzdp3ef6rgVzM2ovfNZxixyqJaLWVXs3hClsNM66e64 1BWgzy9C1bW2GNq8rAXTMmAdjPt19yudNFoLh9dizoabUjmmNWG4cYlochjN25kchOre GnRftYROD835weHuc9PJgcPZ2RodXatPTd6Rlpr35AnCeX+r63Re8foKSmjTyv7AMz3z O+NOM7LHmG5KdpPfqIu+jqmZecuvUcUOk+CBHw5pb8fdqQlHBGwYdlLPtO9pUSV5NgB4 djcg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=D4NuN+fMqTWypgZwou/P67mZqvTtL5kh4ARQwWI8a1M=; b=IQjPACoh8aIn1e37sghiExX6iQakp+Gembyw1wvucNPLRS24wXovBWGjFJygGASJoS x8mcyBQugQyP5eL/UbJhJoHQeUBFvLb+y6O3m/+Nb/Kg42ZsTJ11mzU1hq/M5wSEqaDq ZUp0CU/tAzzNYtx1KMqvBWMZTOZoOhevITuq7FdEXeXmruUUf3/TC4ldbLCKUzN1mmyu Q8clnhz9UsNhO/9cUjNLA5nKSiqx2AojMfeGK20JI2369+n8evQ+j2JmZyUmjlSiLmZP mCETkE48myYVAerWV0ck+4l65DT8IfbgJy1XNL5M30EZDkhZi0GLpomyQgbg/m/ijw4u gcNg== X-Gm-Message-State: AJIora9PKmg7Ko/H12mWjXb912mz1MCmyWQemetBYTMUu+TF9Qbw4Kp1 ygKBLOxl+G7ArJ0vlNtXLQ0qYXPJ8Xs= X-Google-Smtp-Source: AGRyM1svvl2VJUNIIVrGAaMYT0m5YnUBDYXYYWy6fttZVngBOHPKelquCkGj2S7rjfvS0JIjahcwVGVX7M8= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a0d:d904:0:b0:31e:6072:3a94 with SMTP id b4-20020a0dd904000000b0031e60723a94mr2136619ywe.214.1658537532327; Fri, 22 Jul 2022 17:52:12 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:32 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-20-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 19/24] KVM: x86: Morph pending exceptions to pending VM-Exits at queue time From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Morph pending exceptions to pending VM-Exits (due to interception) when the exception is queued instead of waiting until nested events are checked at VM-Entry. This fixes a longstanding bug where KVM fails to handle an exception that occurs during delivery of a previous exception, KVM (L0) and L1 both want to intercept the exception (e.g. #PF for shadow paging), and KVM determines that the exception is in the guest's domain, i.e. queues the new exception for L2. Deferring the interception check causes KVM to esclate various combinations of injected+pending exceptions to double fault (#DF) without consulting L1's interception desires, and ends up injecting a spurious #DF into L2. KVM has fudged around the issue for #PF by special casing emulated #PF injection for shadow paging, but the underlying issue is not unique to shadow paging in L0, e.g. if KVM is intercepting #PF because the guest has a smaller maxphyaddr and L1 (but not L0) is using shadow paging. Other exceptions are affected as well, e.g. if KVM is intercepting #GP for one of SVM's workaround or for the VMware backdoor emulation stuff. The other cases have gone unnoticed because the #DF is spurious if and only if L1 resolves the exception, e.g. KVM's goofs go unnoticed if L1 would have injected #DF anyways. The hack-a-fix has also led to ugly code, e.g. bailing from the emulator if #PF injection forced a nested VM-Exit and the emulator finds itself back in L1. Allowing for direct-to-VM-Exit queueing also neatly solves the async #PF in L2 mess; no need to set a magic flag and token, simply queue a #PF nested VM-Exit. Deal with event migration by flagging that a pending exception was queued by userspace and check for interception at the next KVM_RUN, e.g. so that KVM does the right thing regardless of the order in which userspace restores nested state vs. event state. When "getting" events from userspace, simply drop any pending excpetion that is destined to be intercepted if there is also an injected exception to be migrated. Ideally, KVM would migrate both events, but that would require new ABI, and practically speaking losing the event is unlikely to be noticed, let alone fatal. The injected exception is captured, RIP still points at the original faulting instruction, etc... So either the injection on the target will trigger the same intercepted exception, or the source of the intercepted exception was transient and/or non-deterministic, thus dropping it is ok-ish. Fixes: a04aead144fd ("KVM: nSVM: fix running nested guests when npt=0") Fixes: feaf0c7dc473 ("KVM: nVMX: Do not generate #DF if #PF happens during exception delivery into L2") Cc: Jim Mattson Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- arch/x86/include/asm/kvm_host.h | 12 ++- arch/x86/kvm/svm/nested.c | 45 +++------ arch/x86/kvm/vmx/nested.c | 109 ++++++++++------------ arch/x86/kvm/vmx/vmx.c | 6 +- arch/x86/kvm/x86.c | 159 ++++++++++++++++++++++---------- arch/x86/kvm/x86.h | 7 ++ 6 files changed, 188 insertions(+), 150 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 0a6a05e25f24..6bcbffb42420 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -647,7 +647,6 @@ struct kvm_queued_exception { u32 error_code; unsigned long payload; bool has_payload; - u8 nested_apf; }; struct kvm_vcpu_arch { @@ -748,8 +747,12 @@ struct kvm_vcpu_arch { u8 event_exit_inst_len; + bool exception_from_userspace; + /* Exceptions to be injected to the guest. */ struct kvm_queued_exception exception; + /* Exception VM-Exits to be synthesized to L1. */ + struct kvm_queued_exception exception_vmexit; struct kvm_queued_interrupt { bool injected; @@ -860,7 +863,6 @@ struct kvm_vcpu_arch { u32 id; bool send_user_only; u32 host_apf_flags; - unsigned long nested_apf_token; bool delivery_as_pf_vmexit; bool pageready_pending; } apf; @@ -1636,9 +1638,9 @@ struct kvm_x86_ops { struct kvm_x86_nested_ops { void (*leave_nested)(struct kvm_vcpu *vcpu); + bool (*is_exception_vmexit)(struct kvm_vcpu *vcpu, u8 vector, + u32 error_code); int (*check_events)(struct kvm_vcpu *vcpu); - bool (*handle_page_fault_workaround)(struct kvm_vcpu *vcpu, - struct x86_exception *fault); bool (*hv_timer_pending)(struct kvm_vcpu *vcpu); void (*triple_fault)(struct kvm_vcpu *vcpu); int (*get_state)(struct kvm_vcpu *vcpu, @@ -1865,7 +1867,7 @@ void kvm_queue_exception_p(struct kvm_vcpu *vcpu, unsigned nr, unsigned long pay void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr); void kvm_requeue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code); void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault); -bool kvm_inject_emulated_page_fault(struct kvm_vcpu *vcpu, +void kvm_inject_emulated_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault); bool kvm_require_cpl(struct kvm_vcpu *vcpu, int required_cpl); bool kvm_require_dr(struct kvm_vcpu *vcpu, int dr); diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index a6111392985c..405075286965 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -55,28 +55,6 @@ static void nested_svm_inject_npf_exit(struct kvm_vcpu *vcpu, nested_svm_vmexit(svm); } -static bool nested_svm_handle_page_fault_workaround(struct kvm_vcpu *vcpu, - struct x86_exception *fault) -{ - struct vcpu_svm *svm = to_svm(vcpu); - struct vmcb *vmcb = svm->vmcb; - - WARN_ON(!is_guest_mode(vcpu)); - - if (vmcb12_is_intercept(&svm->nested.ctl, - INTERCEPT_EXCEPTION_OFFSET + PF_VECTOR) && - !WARN_ON_ONCE(svm->nested.nested_run_pending)) { - vmcb->control.exit_code = SVM_EXIT_EXCP_BASE + PF_VECTOR; - vmcb->control.exit_code_hi = 0; - vmcb->control.exit_info_1 = fault->error_code; - vmcb->control.exit_info_2 = fault->address; - nested_svm_vmexit(svm); - return true; - } - - return false; -} - static u64 nested_svm_get_tdp_pdptr(struct kvm_vcpu *vcpu, int index) { struct vcpu_svm *svm = to_svm(vcpu); @@ -1304,16 +1282,17 @@ int nested_svm_check_permissions(struct kvm_vcpu *vcpu) return 0; } -static bool nested_exit_on_exception(struct vcpu_svm *svm) +static bool nested_svm_is_exception_vmexit(struct kvm_vcpu *vcpu, u8 vector, + u32 error_code) { - unsigned int vector = svm->vcpu.arch.exception.vector; + struct vcpu_svm *svm = to_svm(vcpu); return (svm->nested.ctl.intercepts[INTERCEPT_EXCEPTION] & BIT(vector)); } static void nested_svm_inject_exception_vmexit(struct kvm_vcpu *vcpu) { - struct kvm_queued_exception *ex = &vcpu->arch.exception; + struct kvm_queued_exception *ex = &vcpu->arch.exception_vmexit; struct vcpu_svm *svm = to_svm(vcpu); struct vmcb *vmcb = svm->vmcb; @@ -1328,9 +1307,7 @@ static void nested_svm_inject_exception_vmexit(struct kvm_vcpu *vcpu) * than #PF. */ if (ex->vector == PF_VECTOR) { - if (ex->nested_apf) - vmcb->control.exit_info_2 = vcpu->arch.apf.nested_apf_token; - else if (ex->has_payload) + if (ex->has_payload) vmcb->control.exit_info_2 = ex->payload; else vmcb->control.exit_info_2 = vcpu->arch.cr2; @@ -1383,15 +1360,19 @@ static int svm_check_nested_events(struct kvm_vcpu *vcpu) return 0; } - if (vcpu->arch.exception.pending) { + if (vcpu->arch.exception_vmexit.pending) { if (block_nested_exceptions) return -EBUSY; - if (!nested_exit_on_exception(svm)) - return 0; nested_svm_inject_exception_vmexit(vcpu); return 0; } + if (vcpu->arch.exception.pending) { + if (block_nested_exceptions) + return -EBUSY; + return 0; + } + if (vcpu->arch.smi_pending && !svm_smi_blocked(vcpu)) { if (block_nested_events) return -EBUSY; @@ -1729,8 +1710,8 @@ static bool svm_get_nested_state_pages(struct kvm_vcpu *vcpu) struct kvm_x86_nested_ops svm_nested_ops = { .leave_nested = svm_leave_nested, + .is_exception_vmexit = nested_svm_is_exception_vmexit, .check_events = svm_check_nested_events, - .handle_page_fault_workaround = nested_svm_handle_page_fault_workaround, .triple_fault = nested_svm_triple_fault, .get_nested_state_pages = svm_get_nested_state_pages, .get_state = svm_get_nested_state, diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 3f6617cbf5d9..ec69e1573c3b 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -439,59 +439,22 @@ static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12, return inequality ^ bit; } - -/* - * KVM wants to inject page-faults which it got to the guest. This function - * checks whether in a nested guest, we need to inject them to L1 or L2. - */ -static int nested_vmx_check_exception(struct kvm_vcpu *vcpu, unsigned long *exit_qual) -{ - struct kvm_queued_exception *ex = &vcpu->arch.exception; - struct vmcs12 *vmcs12 = get_vmcs12(vcpu); - - if (ex->vector == PF_VECTOR) { - if (ex->nested_apf) { - *exit_qual = vcpu->arch.apf.nested_apf_token; - return 1; - } - if (nested_vmx_is_page_fault_vmexit(vmcs12, ex->error_code)) { - *exit_qual = ex->has_payload ? ex->payload : vcpu->arch.cr2; - return 1; - } - } else if (vmcs12->exception_bitmap & (1u << ex->vector)) { - if (ex->vector == DB_VECTOR) { - if (ex->has_payload) { - *exit_qual = ex->payload; - } else { - *exit_qual = vcpu->arch.dr6; - *exit_qual &= ~DR6_BT; - *exit_qual ^= DR6_ACTIVE_LOW; - } - } else - *exit_qual = 0; - return 1; - } - - return 0; -} - -static bool nested_vmx_handle_page_fault_workaround(struct kvm_vcpu *vcpu, - struct x86_exception *fault) +static bool nested_vmx_is_exception_vmexit(struct kvm_vcpu *vcpu, u8 vector, + u32 error_code) { struct vmcs12 *vmcs12 = get_vmcs12(vcpu); - WARN_ON(!is_guest_mode(vcpu)); + /* + * Drop bits 31:16 of the error code when performing the #PF mask+match + * check. All VMCS fields involved are 32 bits, but Intel CPUs never + * set bits 31:16 and VMX disallows setting bits 31:16 in the injected + * error code. Including the to-be-dropped bits in the check might + * result in an "impossible" or missed exit from L1's perspective. + */ + if (vector == PF_VECTOR) + return nested_vmx_is_page_fault_vmexit(vmcs12, (u16)error_code); - if (nested_vmx_is_page_fault_vmexit(vmcs12, fault->error_code) && - !WARN_ON_ONCE(to_vmx(vcpu)->nested.nested_run_pending)) { - vmcs12->vm_exit_intr_error_code = fault->error_code; - nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI, - PF_VECTOR | INTR_TYPE_HARD_EXCEPTION | - INTR_INFO_DELIVER_CODE_MASK | INTR_INFO_VALID_MASK, - fault->address); - return true; - } - return false; + return (vmcs12->exception_bitmap & (1u << vector)); } static int nested_vmx_check_io_bitmap_controls(struct kvm_vcpu *vcpu, @@ -3817,12 +3780,24 @@ static int vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu) return -ENXIO; } -static void nested_vmx_inject_exception_vmexit(struct kvm_vcpu *vcpu, - unsigned long exit_qual) +static void nested_vmx_inject_exception_vmexit(struct kvm_vcpu *vcpu) { - struct kvm_queued_exception *ex = &vcpu->arch.exception; + struct kvm_queued_exception *ex = &vcpu->arch.exception_vmexit; u32 intr_info = ex->vector | INTR_INFO_VALID_MASK; struct vmcs12 *vmcs12 = get_vmcs12(vcpu); + unsigned long exit_qual; + + if (ex->has_payload) { + exit_qual = ex->payload; + } else if (ex->vector == PF_VECTOR) { + exit_qual = vcpu->arch.cr2; + } else if (ex->vector == DB_VECTOR) { + exit_qual = vcpu->arch.dr6; + exit_qual &= ~DR6_BT; + exit_qual ^= DR6_ACTIVE_LOW; + } else { + exit_qual = 0; + } if (ex->has_error_code) { /* @@ -3995,7 +3970,6 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) { struct kvm_lapic *apic = vcpu->arch.apic; struct vcpu_vmx *vmx = to_vmx(vcpu); - unsigned long exit_qual; /* * Only a pending nested run blocks a pending exception. If there is a * previously injected event, the pending exception occurred while said @@ -4049,14 +4023,20 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) * across SMI/RSM as it should; that needs to be addressed in order to * prioritize SMI over MTF and trap-like #DBs. */ + if (vcpu->arch.exception_vmexit.pending && + !vmx_is_low_priority_db_trap(&vcpu->arch.exception_vmexit)) { + if (block_nested_exceptions) + return -EBUSY; + + nested_vmx_inject_exception_vmexit(vcpu); + return 0; + } + if (vcpu->arch.exception.pending && !vmx_is_low_priority_db_trap(&vcpu->arch.exception)) { if (block_nested_exceptions) return -EBUSY; - if (!nested_vmx_check_exception(vcpu, &exit_qual)) - goto no_vmexit; - nested_vmx_inject_exception_vmexit(vcpu, exit_qual); - return 0; + goto no_vmexit; } if (vmx->nested.mtf_pending) { @@ -4067,13 +4047,18 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) return 0; } + if (vcpu->arch.exception_vmexit.pending) { + if (block_nested_exceptions) + return -EBUSY; + + nested_vmx_inject_exception_vmexit(vcpu); + return 0; + } + if (vcpu->arch.exception.pending) { if (block_nested_exceptions) return -EBUSY; - if (!nested_vmx_check_exception(vcpu, &exit_qual)) - goto no_vmexit; - nested_vmx_inject_exception_vmexit(vcpu, exit_qual); - return 0; + goto no_vmexit; } if (nested_vmx_preemption_timer_pending(vcpu)) { @@ -6946,8 +6931,8 @@ __init int nested_vmx_hardware_setup(int (*exit_handlers[])(struct kvm_vcpu *)) struct kvm_x86_nested_ops vmx_nested_ops = { .leave_nested = vmx_leave_nested, + .is_exception_vmexit = nested_vmx_is_exception_vmexit, .check_events = vmx_check_nested_events, - .handle_page_fault_workaround = nested_vmx_handle_page_fault_workaround, .hv_timer_pending = nested_vmx_preemption_timer_pending, .triple_fault = nested_vmx_triple_fault, .get_state = vmx_get_nested_state, diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 4cfe6646476b..cf8877c545ce 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1585,7 +1585,9 @@ static void vmx_update_emulated_instruction(struct kvm_vcpu *vcpu) */ if (nested_cpu_has_mtf(vmcs12) && (!vcpu->arch.exception.pending || - vcpu->arch.exception.vector == DB_VECTOR)) + vcpu->arch.exception.vector == DB_VECTOR) && + (!vcpu->arch.exception_vmexit.pending || + vcpu->arch.exception_vmexit.vector == DB_VECTOR)) vmx->nested.mtf_pending = true; else vmx->nested.mtf_pending = false; @@ -5638,7 +5640,7 @@ static bool vmx_emulation_required_with_pending_exception(struct kvm_vcpu *vcpu) struct vcpu_vmx *vmx = to_vmx(vcpu); return vmx->emulation_required && !vmx->rmode.vm86_active && - (vcpu->arch.exception.pending || vcpu->arch.exception.injected); + (kvm_is_exception_pending(vcpu) || vcpu->arch.exception.injected); } static int handle_invalid_guest_state(struct kvm_vcpu *vcpu) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index fb9edbb6ee1a..6c77411fc570 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -609,6 +609,21 @@ void kvm_deliver_exception_payload(struct kvm_vcpu *vcpu, } EXPORT_SYMBOL_GPL(kvm_deliver_exception_payload); +static void kvm_queue_exception_vmexit(struct kvm_vcpu *vcpu, unsigned int vector, + bool has_error_code, u32 error_code, + bool has_payload, unsigned long payload) +{ + struct kvm_queued_exception *ex = &vcpu->arch.exception_vmexit; + + ex->vector = vector; + ex->injected = false; + ex->pending = true; + ex->has_error_code = has_error_code; + ex->error_code = error_code; + ex->has_payload = has_payload; + ex->payload = payload; +} + static void kvm_multiple_exception(struct kvm_vcpu *vcpu, unsigned nr, bool has_error, u32 error_code, bool has_payload, unsigned long payload, bool reinject) @@ -618,18 +633,31 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu, kvm_make_request(KVM_REQ_EVENT, vcpu); + /* + * If the exception is destined for L2 and isn't being reinjected, + * morph it to a VM-Exit if L1 wants to intercept the exception. A + * previously injected exception is not checked because it was checked + * when it was original queued, and re-checking is incorrect if _L1_ + * injected the exception, in which case it's exempt from interception. + */ + if (!reinject && is_guest_mode(vcpu) && + kvm_x86_ops.nested_ops->is_exception_vmexit(vcpu, nr, error_code)) { + kvm_queue_exception_vmexit(vcpu, nr, has_error, error_code, + has_payload, payload); + return; + } + if (!vcpu->arch.exception.pending && !vcpu->arch.exception.injected) { queue: if (reinject) { /* - * On vmentry, vcpu->arch.exception.pending is only - * true if an event injection was blocked by - * nested_run_pending. In that case, however, - * vcpu_enter_guest requests an immediate exit, - * and the guest shouldn't proceed far enough to - * need reinjection. + * On VM-Entry, an exception can be pending if and only + * if event injection was blocked by nested_run_pending. + * In that case, however, vcpu_enter_guest() requests an + * immediate exit, and the guest shouldn't proceed far + * enough to need reinjection. */ - WARN_ON_ONCE(vcpu->arch.exception.pending); + WARN_ON_ONCE(kvm_is_exception_pending(vcpu)); vcpu->arch.exception.injected = true; if (WARN_ON_ONCE(has_payload)) { /* @@ -732,20 +760,22 @@ static int complete_emulated_insn_gp(struct kvm_vcpu *vcpu, int err) void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault) { ++vcpu->stat.pf_guest; - vcpu->arch.exception.nested_apf = - is_guest_mode(vcpu) && fault->async_page_fault; - if (vcpu->arch.exception.nested_apf) { - vcpu->arch.apf.nested_apf_token = fault->address; - kvm_queue_exception_e(vcpu, PF_VECTOR, fault->error_code); - } else { + + /* + * Async #PF in L2 is always forwarded to L1 as a VM-Exit regardless of + * whether or not L1 wants to intercept "regular" #PF. + */ + if (is_guest_mode(vcpu) && fault->async_page_fault) + kvm_queue_exception_vmexit(vcpu, PF_VECTOR, + true, fault->error_code, + true, fault->address); + else kvm_queue_exception_e_p(vcpu, PF_VECTOR, fault->error_code, fault->address); - } } EXPORT_SYMBOL_GPL(kvm_inject_page_fault); -/* Returns true if the page fault was immediately morphed into a VM-Exit. */ -bool kvm_inject_emulated_page_fault(struct kvm_vcpu *vcpu, +void kvm_inject_emulated_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault) { struct kvm_mmu *fault_mmu; @@ -763,26 +793,7 @@ bool kvm_inject_emulated_page_fault(struct kvm_vcpu *vcpu, kvm_mmu_invalidate_gva(vcpu, fault_mmu, fault->address, fault_mmu->root.hpa); - /* - * A workaround for KVM's bad exception handling. If KVM injected an - * exception into L2, and L2 encountered a #PF while vectoring the - * injected exception, manually check to see if L1 wants to intercept - * #PF, otherwise queuing the #PF will lead to #DF or a lost exception. - * In all other cases, defer the check to nested_ops->check_events(), - * which will correctly handle priority (this does not). Note, other - * exceptions, e.g. #GP, are theoretically affected, #PF is simply the - * most problematic, e.g. when L0 and L1 are both intercepting #PF for - * shadow paging. - * - * TODO: Rewrite exception handling to track injected and pending - * (VM-Exit) exceptions separately. - */ - if (unlikely(vcpu->arch.exception.injected && is_guest_mode(vcpu)) && - kvm_x86_ops.nested_ops->handle_page_fault_workaround(vcpu, fault)) - return true; - fault_mmu->inject_page_fault(vcpu, fault); - return false; } EXPORT_SYMBOL_GPL(kvm_inject_emulated_page_fault); @@ -4826,7 +4837,7 @@ static int kvm_vcpu_ready_for_interrupt_injection(struct kvm_vcpu *vcpu) return (kvm_arch_interrupt_allowed(vcpu) && kvm_cpu_accept_dm_intr(vcpu) && !kvm_event_needs_reinjection(vcpu) && - !vcpu->arch.exception.pending); + !kvm_is_exception_pending(vcpu)); } static int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu, @@ -5001,13 +5012,27 @@ static int kvm_vcpu_ioctl_x86_set_mce(struct kvm_vcpu *vcpu, static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu, struct kvm_vcpu_events *events) { - struct kvm_queued_exception *ex = &vcpu->arch.exception; + struct kvm_queued_exception *ex; process_nmi(vcpu); if (kvm_check_request(KVM_REQ_SMI, vcpu)) process_smi(vcpu); + /* + * KVM's ABI only allows for one exception to be migrated. Luckily, + * the only time there can be two queued exceptions is if there's a + * non-exiting _injected_ exception, and a pending exiting exception. + * In that case, ignore the VM-Exiting exception as it's an extension + * of the injected exception. + */ + if (vcpu->arch.exception_vmexit.pending && + !vcpu->arch.exception.pending && + !vcpu->arch.exception.injected) + ex = &vcpu->arch.exception_vmexit; + else + ex = &vcpu->arch.exception; + /* * In guest mode, payload delivery should be deferred if the exception * will be intercepted by L1, e.g. KVM should not modifying CR2 if L1 @@ -5114,6 +5139,19 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu, return -EINVAL; process_nmi(vcpu); + + /* + * Flag that userspace is stuffing an exception, the next KVM_RUN will + * morph the exception to a VM-Exit if appropriate. Do this only for + * pending exceptions, already-injected exceptions are not subject to + * intercpetion. Note, userspace that conflates pending and injected + * is hosed, and will incorrectly convert an injected exception into a + * pending exception, which in turn may cause a spurious VM-Exit. + */ + vcpu->arch.exception_from_userspace = events->exception.pending; + + vcpu->arch.exception_vmexit.pending = false; + vcpu->arch.exception.injected = events->exception.injected; vcpu->arch.exception.pending = events->exception.pending; vcpu->arch.exception.vector = events->exception.nr; @@ -8136,18 +8174,17 @@ static void toggle_interruptibility(struct kvm_vcpu *vcpu, u32 mask) } } -static bool inject_emulated_exception(struct kvm_vcpu *vcpu) +static void inject_emulated_exception(struct kvm_vcpu *vcpu) { struct x86_emulate_ctxt *ctxt = vcpu->arch.emulate_ctxt; + if (ctxt->exception.vector == PF_VECTOR) - return kvm_inject_emulated_page_fault(vcpu, &ctxt->exception); - - if (ctxt->exception.error_code_valid) + kvm_inject_emulated_page_fault(vcpu, &ctxt->exception); + else if (ctxt->exception.error_code_valid) kvm_queue_exception_e(vcpu, ctxt->exception.vector, ctxt->exception.error_code); else kvm_queue_exception(vcpu, ctxt->exception.vector); - return false; } static struct x86_emulate_ctxt *alloc_emulate_ctxt(struct kvm_vcpu *vcpu) @@ -8760,8 +8797,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, if (ctxt->have_exception) { r = 1; - if (inject_emulated_exception(vcpu)) - return r; + inject_emulated_exception(vcpu); } else if (vcpu->arch.pio.count) { if (!vcpu->arch.pio.in) { /* FIXME: return into emulator if single-stepping. */ @@ -9708,7 +9744,7 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit) */ if (vcpu->arch.exception.injected) kvm_inject_exception(vcpu); - else if (vcpu->arch.exception.pending) + else if (kvm_is_exception_pending(vcpu)) ; /* see above */ else if (vcpu->arch.nmi_injected) static_call(kvm_x86_inject_nmi)(vcpu); @@ -9735,6 +9771,14 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit) if (r < 0) goto out; + /* + * A pending exception VM-Exit should either result in nested VM-Exit + * or force an immediate re-entry and exit to/from L2, and exception + * VM-Exits cannot be injected (flag should _never_ be set). + */ + WARN_ON_ONCE(vcpu->arch.exception_vmexit.injected || + vcpu->arch.exception_vmexit.pending); + /* * New events, other than exceptions, cannot be injected if KVM needs * to re-inject a previous event. See above comments on re-injecting @@ -9834,7 +9878,7 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit) kvm_x86_ops.nested_ops->hv_timer_pending(vcpu)) *req_immediate_exit = true; - WARN_ON(vcpu->arch.exception.pending); + WARN_ON(kvm_is_exception_pending(vcpu)); return 0; out: @@ -10852,6 +10896,7 @@ static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu) int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) { + struct kvm_queued_exception *ex = &vcpu->arch.exception; struct kvm_run *kvm_run = vcpu->run; int r; @@ -10910,6 +10955,21 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) } } + /* + * If userspace set a pending exception and L2 is active, convert it to + * a pending VM-Exit if L1 wants to intercept the exception. + */ + if (vcpu->arch.exception_from_userspace && is_guest_mode(vcpu) && + kvm_x86_ops.nested_ops->is_exception_vmexit(vcpu, ex->vector, + ex->error_code)) { + kvm_queue_exception_vmexit(vcpu, ex->vector, + ex->has_error_code, ex->error_code, + ex->has_payload, ex->payload); + ex->injected = false; + ex->pending = false; + } + vcpu->arch.exception_from_userspace = false; + if (unlikely(vcpu->arch.complete_userspace_io)) { int (*cui)(struct kvm_vcpu *) = vcpu->arch.complete_userspace_io; vcpu->arch.complete_userspace_io = NULL; @@ -11016,6 +11076,7 @@ static void __set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) kvm_set_rflags(vcpu, regs->rflags | X86_EFLAGS_FIXED); vcpu->arch.exception.pending = false; + vcpu->arch.exception_vmexit.pending = false; kvm_make_request(KVM_REQ_EVENT, vcpu); } @@ -11383,7 +11444,7 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, if (dbg->control & (KVM_GUESTDBG_INJECT_DB | KVM_GUESTDBG_INJECT_BP)) { r = -EBUSY; - if (vcpu->arch.exception.pending) + if (kvm_is_exception_pending(vcpu)) goto out; if (dbg->control & KVM_GUESTDBG_INJECT_DB) kvm_queue_exception(vcpu, DB_VECTOR); @@ -12567,7 +12628,7 @@ static inline bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu) if (vcpu->arch.pv.pv_unhalted) return true; - if (vcpu->arch.exception.pending) + if (kvm_is_exception_pending(vcpu)) return true; if (kvm_test_request(KVM_REQ_NMI, vcpu) || @@ -12822,7 +12883,7 @@ bool kvm_can_do_async_pf(struct kvm_vcpu *vcpu) { if (unlikely(!lapic_in_kernel(vcpu) || kvm_event_needs_reinjection(vcpu) || - vcpu->arch.exception.pending)) + kvm_is_exception_pending(vcpu))) return false; if (kvm_hlt_in_guest(vcpu->kvm) && !kvm_can_deliver_async_pf(vcpu)) diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index 4147d27f9fbc..256745d1a2c3 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -82,10 +82,17 @@ static inline unsigned int __shrink_ple_window(unsigned int val, void kvm_service_local_tlb_flush_requests(struct kvm_vcpu *vcpu); int kvm_check_nested_events(struct kvm_vcpu *vcpu); +static inline bool kvm_is_exception_pending(struct kvm_vcpu *vcpu) +{ + return vcpu->arch.exception.pending || + vcpu->arch.exception_vmexit.pending; +} + static inline void kvm_clear_exception_queue(struct kvm_vcpu *vcpu) { vcpu->arch.exception.pending = false; vcpu->arch.exception.injected = false; + vcpu->arch.exception_vmexit.pending = false; } static inline void kvm_queue_interrupt(struct kvm_vcpu *vcpu, u8 vector, From patchwork Sat Jul 23 00:51:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927011 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 87995C433EF for ; Sat, 23 Jul 2022 00:53:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236981AbiGWAxR (ORCPT ); Fri, 22 Jul 2022 20:53:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59232 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236628AbiGWAxB (ORCPT ); Fri, 22 Jul 2022 20:53:01 -0400 Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6DE66C0B4F for ; Fri, 22 Jul 2022 17:52:14 -0700 (PDT) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-31e55e88567so50777367b3.15 for ; Fri, 22 Jul 2022 17:52:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=rUkhCIu3G6hVGxoVXeosOYZ1CHA90hebfQ0IThcWEiU=; b=dTZ9QzwbkjN8vSw1b4ZQ5CPuB06qdpj8gpR/RWuiDwzwolmGm5N2ePpnmceqHUUQJs 0nvprNHcU9JI67sqPhhJ9qYMnNEg4CfTCUK6ckj2Ym7QAmsGTMhbqy3/0S20Z9ptyE6h dnibPvvvHJTvRqyMuC/92cRS/ORmWU8z2WBFq0cuGvUvII05DIAE0nr8zIn6gwwE9lNh idyu6gTKtu66P6QDBbqi6NUYlABPgUfQlT05+0xU/VjFswiz8N3hoFx7amWMziwUaTpP u9j4Xt4a13fHwgW8fmnG2cyeia1vvNLOhKteYh3ZLw5di772IK5Aqkil4vV2VfoWVIjL OIRA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=rUkhCIu3G6hVGxoVXeosOYZ1CHA90hebfQ0IThcWEiU=; b=Lugsfwp6NnFXnliI0+EIkIib1ka3dUjOkcS9npUXNiT6H7AjCzsn1airU5nzFU+6uv BxD86jALrcQB7EpEu8Ge0Wyd3RRSj2HrbUAeMlrY1WL6XqePNDeublo4/xyP6DRSkYpz 0IDtS6HO4EuPo48wPo545+kl7qVCmaTNVE3VGKDSKktoRmRD9l7gxoMTw49zL3PC4eaE 59Y7cRt7dZ00BAhZvWZcUjRPI5y8LQtmjhwD6HgFMqL7C1bHS1e9S7HRcwOjac7nhbJY LONNlInhhMoB4m3vODTvqw7v99BsCF1DZDvWFd+k1GvwRkfiGtsbHToXWlcVzUXaBRBp DPOw== X-Gm-Message-State: AJIora/fPLje5XOwaU7QFisrF+OTTdbV4yW9FjzpXIUZvb+0hkms4eAY KHBQTw2LCTKS1h2Vm8vZBL9WmTk1GvE= X-Google-Smtp-Source: AGRyM1tLKMklWgQ8FNSImnL/X0u9s1C+BqiyV2k5uUxPZxT/XciQjfwMQBolnnSWQj3XA+G8cMoIrTEuWQ4= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a25:b98c:0:b0:670:9213:4f8a with SMTP id r12-20020a25b98c000000b0067092134f8amr2136101ybg.171.1658537533909; Fri, 22 Jul 2022 17:52:13 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:33 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-21-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 20/24] KVM: x86: Treat pending TRIPLE_FAULT requests as pending exceptions From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Treat pending TRIPLE_FAULTS as pending exceptions. A triple fault is an exception for all intents and purposes, it's just not tracked as such because there's no vector associated the exception. E.g. if userspace were to set vcpu->request_interrupt_window while running L2 and L2 hit a triple fault, a triple fault nested VM-Exit should be synthesized to L1 before exiting to userspace with KVM_EXIT_IRQ_WINDOW_OPEN. Link: https://lore.kernel.org/all/YoVHAIGcFgJit1qp@google.com Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- arch/x86/kvm/x86.c | 3 --- arch/x86/kvm/x86.h | 3 ++- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 6c77411fc570..d714f335749c 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -12657,9 +12657,6 @@ static inline bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu) if (kvm_xen_has_pending_events(vcpu)) return true; - if (kvm_test_request(KVM_REQ_TRIPLE_FAULT, vcpu)) - return true; - return false; } diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index 256745d1a2c3..a784ff90740b 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -85,7 +85,8 @@ int kvm_check_nested_events(struct kvm_vcpu *vcpu); static inline bool kvm_is_exception_pending(struct kvm_vcpu *vcpu) { return vcpu->arch.exception.pending || - vcpu->arch.exception_vmexit.pending; + vcpu->arch.exception_vmexit.pending || + kvm_test_request(KVM_REQ_TRIPLE_FAULT, vcpu); } static inline void kvm_clear_exception_queue(struct kvm_vcpu *vcpu) From patchwork Sat Jul 23 00:51:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927023 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A967BC433EF for ; Sat, 23 Jul 2022 00:53:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237134AbiGWAxr (ORCPT ); Fri, 22 Jul 2022 20:53:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57828 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236957AbiGWAxP (ORCPT ); Fri, 22 Jul 2022 20:53:15 -0400 Received: from mail-pf1-x44a.google.com (mail-pf1-x44a.google.com [IPv6:2607:f8b0:4864:20::44a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2E97CC0B54 for ; Fri, 22 Jul 2022 17:52:26 -0700 (PDT) Received: by mail-pf1-x44a.google.com with SMTP id y37-20020a056a001ca500b00528bbf82c1eso2470377pfw.10 for ; Fri, 22 Jul 2022 17:52:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=pY4rxlForxQbmu/q/udq3u4PIKEqbChyIAtdlo0LuhQ=; b=jPrSuztZoQ+lv7DF55SpPwpb/t2LP0ggMpXTN+6d2s8+55zaysm3QQBP8Z3dzruo2o lREV46xx9I1bXRGbXpOL33TxjMZUuQG2e8zq/E0ToWO9e7NNFrYN+fBcWhIq2MusDfep GDOehkBROOUKx7WNbUh3ARt4u8Wmuu4FdWLqEy7RyeF4jGhh+7Ulkgwkl9Lh8McSE6MR uzFyKK7bMEbhavZyQsbty2jo61hTaji6uOs5ogQ1/oeD/v4onFOVoc/sl2YZj9XGryGu QKvvMM+5AyVnvxIbpc5xx54iTR4BNyZ/fT5YOaHBF7VFpj4I2kwGJ+VOClGI85TuwSEx B7YQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=pY4rxlForxQbmu/q/udq3u4PIKEqbChyIAtdlo0LuhQ=; b=4XADso21W0LoMydcoaYMWzoN5joB5NLj7JX3Zc+sBAjdfeQ/OHmJQmoaOUEcipKIXp LnRfJ5CZTkD91+9AjELQFIwVPXTqGr06iBrd4TjYuJy+0KUiZpngPHymviuTtgnMNsrj sjCT8/bdOf/latJapOX1bSPR/TCt1c6tq/NZYkegGNiLhsUkXdXLaYw8xVIb7jRohWS5 SD1HoErvfJWz8mvjIUw5VNwZeeByV6GEdFOylmvOhH+KMq+LohJTm33AZZzPmfit+a4G o2+QUmVVztYlvXxYsiYOJIgfIfETgoHlcMlZ0cDfFHsVRJjNZHunR+0QhU2pHk4ch9Fc U6BA== X-Gm-Message-State: AJIora+TwwfQ6ptULcoPxjzZmb0Qmlonz7As+gWUEAI//cnMJOBugIM8 mbBLy8xh9gK6KVQBMJnfJFsESatCoBA= X-Google-Smtp-Source: AGRyM1tZF7A/hmAoF/i10swfVyMGFVoUBG9EO5sdAqyF2KJRh9MXKffaX5v65KQb9NdY4WmE/8K/jDye1j4= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a05:6a00:b4d:b0:52a:f2cf:b0e4 with SMTP id p13-20020a056a000b4d00b0052af2cfb0e4mr2464187pfo.2.1658537535476; Fri, 22 Jul 2022 17:52:15 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:34 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-22-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 21/24] KVM: VMX: Update MTF and ICEBP comments to document KVM's subtle behavior From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Document the oddities of ICEBP interception (trap-like #DB is intercepted as a fault-like exception), and how using VMX's inner "skip" helper deliberately bypasses the pending MTF and single-step #DB logic. No functional change intended. Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- arch/x86/kvm/vmx/vmx.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index cf8877c545ce..7864353f7547 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1578,9 +1578,13 @@ static void vmx_update_emulated_instruction(struct kvm_vcpu *vcpu) /* * Per the SDM, MTF takes priority over debug-trap exceptions besides - * T-bit traps. As instruction emulation is completed (i.e. at the - * instruction boundary), any #DB exception pending delivery must be a - * debug-trap. Record the pending MTF state to be delivered in + * TSS T-bit traps and ICEBP (INT1). KVM doesn't emulate T-bit traps + * or ICEBP (in the emulator proper), and skipping of ICEBP after an + * intercepted #DB deliberately avoids single-step #DB and MTF updates + * as ICEBP is higher priority than both. As instruction emulation is + * completed at this point (i.e. KVM is at the instruction boundary), + * any #DB exception pending delivery must be a debug-trap of lower + * priority than MTF. Record the pending MTF state to be delivered in * vmx_check_nested_events(). */ if (nested_cpu_has_mtf(vmcs12) && @@ -5085,8 +5089,10 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu) * instruction. ICEBP generates a trap-like #DB, but * despite its interception control being tied to #DB, * is an instruction intercept, i.e. the VM-Exit occurs - * on the ICEBP itself. Note, skipping ICEBP also - * clears STI and MOVSS blocking. + * on the ICEBP itself. Use the inner "skip" helper to + * avoid single-step #DB and MTF updates, as ICEBP is + * higher priority. Note, skipping ICEBP still clears + * STI and MOVSS blocking. * * For all other #DBs, set vmcs.PENDING_DBG_EXCEPTIONS.BS * if single-step is enabled in RFLAGS and STI or MOVSS From patchwork Sat Jul 23 00:51:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927024 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2D491C43334 for ; Sat, 23 Jul 2022 00:54:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236859AbiGWAyL (ORCPT ); Fri, 22 Jul 2022 20:54:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59248 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236986AbiGWAxS (ORCPT ); Fri, 22 Jul 2022 20:53:18 -0400 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4EA97C1DDD for ; Fri, 22 Jul 2022 17:52:29 -0700 (PDT) Received: by mail-pj1-x104a.google.com with SMTP id f16-20020a17090a121000b001f22aa2ac88so2600167pja.7 for ; Fri, 22 Jul 2022 17:52:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=8QUcqYdas0eqDSvXYo4Sqhu3pNlKhGcIVM8bC4z4X4E=; b=sjhztStshfGUZjDU7uyDWl8wKUBQr+RDpjo1ZpkTve75pKSWyt00NUhxwC4ubI3hwv m0VPkRoleUb5rZXcYoQHOjO0RV6QpchamAIQ9PYcsNmzHtsy5H7oMOvDkwFcGyCmWCKS XJoDgXEHuS1V57Toaoc6+lOOPHFdnvMUV/0IFaw8rGfvzRlO2p19Wb0mL7MAUmy5TwgJ OkD7RaEqGX8O1AtHLTvxVZ6qRQXzjlRazM+gQZBGH7s/pg0IfIFkEDSXKsW3VLuVrf5b 26akg/dEbbCI2Iiv26BFVp5C5mWhPWTk6KzOuJOxL9bpiFADJKaNGvAXC6OFjBE0/l7r /fJA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=8QUcqYdas0eqDSvXYo4Sqhu3pNlKhGcIVM8bC4z4X4E=; b=e+c1KYLo3N4Z1+sqoijLfdtgIguHtfXfTD3oU93BT6lC24gKliGA7nsa42g8AAfYKG oI0enWVc1PCZo7CQJmOFk5m6LJEKgduWFN86dSOeRKCD3qeKooAAEL+SBo+S8XLV8jay pRUYk0k73dfJxbjJ6p9ta1Ie+Zwf1K9HAYC5SSStbNSVEQniesnWwm8T+5ce9Yf+TQoF HpQ/KLV1DuXQVEznatUQdR3gnCg3mOa8Juw20pEssTaIUn+cJlu4DTFAxwwCKm40We8m 5B6oFYWYl/EE45uBirVewpzMxdj1y8iSLCrVvc6iS7J51dNUb7XcgqRILM0BMEmAdliE KC6g== X-Gm-Message-State: AJIora/bbuITqYzbCzELhfhVUXvnz10OX2ickvDiyPqAACq477WA1ls3 3X3BOhonISZImaegkD4oBWBDPeDAVyg= X-Google-Smtp-Source: AGRyM1sCJo9oUBudZTtbRG/LpT9zWMtTS9RJn5IyNWtkQJ3jRtzL3h6hKBJ6HXK7cUpLqhTRHmLb1/IUYaM= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a17:902:8602:b0:16c:dfae:9afb with SMTP id f2-20020a170902860200b0016cdfae9afbmr2430916plo.35.1658537536913; Fri, 22 Jul 2022 17:52:16 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:35 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-23-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 22/24] KVM: x86: Rename inject_pending_events() to kvm_check_and_inject_events() From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Rename inject_pending_events() to kvm_check_and_inject_events() in order to capture the fact that it handles more than just pending events, and to (mostly) align with kvm_check_nested_events(), which omits the "inject" for brevity. Add a comment above kvm_check_and_inject_events() to provide a high-level synopsis, and to document a virtualization hole (KVM erratum if you will) that exists due to KVM not strictly tracking instruction boundaries with respect to coincident instruction restarts and asynchronous events. No functional change inteded. Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- arch/x86/kvm/svm/nested.c | 2 +- arch/x86/kvm/svm/svm.c | 2 +- arch/x86/kvm/x86.c | 46 ++++++++++++++++++++++++++++++++++++--- 3 files changed, 45 insertions(+), 5 deletions(-) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 405075286965..6b3b18404533 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -1312,7 +1312,7 @@ static void nested_svm_inject_exception_vmexit(struct kvm_vcpu *vcpu) else vmcb->control.exit_info_2 = vcpu->arch.cr2; } else if (ex->vector == DB_VECTOR) { - /* See inject_pending_event. */ + /* See kvm_check_and_inject_events(). */ kvm_deliver_exception_payload(vcpu, ex); if (vcpu->arch.dr7 & DR7_GD) { diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 74cbe177e0d1..12e66e2114d1 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -3520,7 +3520,7 @@ void svm_complete_interrupt_delivery(struct kvm_vcpu *vcpu, int delivery_mode, /* Note, this is called iff the local APIC is in-kernel. */ if (!READ_ONCE(vcpu->arch.apic->apicv_active)) { - /* Process the interrupt via inject_pending_event */ + /* Process the interrupt via kvm_check_and_inject_events(). */ kvm_make_request(KVM_REQ_EVENT, vcpu); kvm_vcpu_kick(vcpu); return; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index d714f335749c..4ed4811a7137 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9704,7 +9704,47 @@ static void kvm_inject_exception(struct kvm_vcpu *vcpu) static_call(kvm_x86_inject_exception)(vcpu); } -static int inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit) +/* + * Check for any event (interrupt or exception) that is ready to be injected, + * and if there is at least one event, inject the event with the highest + * priority. This handles both "pending" events, i.e. events that have never + * been injected into the guest, and "injected" events, i.e. events that were + * injected as part of a previous VM-Enter, but weren't successfully delivered + * and need to be re-injected. + * + * Note, this is not guaranteed to be invoked on a guest instruction boundary, + * i.e. doesn't guarantee that there's an event window in the guest. KVM must + * be able to inject exceptions in the "middle" of an instruction, and so must + * also be able to re-inject NMIs and IRQs in the middle of an instruction. + * I.e. for exceptions and re-injected events, NOT invoking this on instruction + * boundaries is necessary and correct. + * + * For simplicity, KVM uses a single path to inject all events (except events + * that are injected directly from L1 to L2) and doesn't explicitly track + * instruction boundaries for asynchronous events. However, because VM-Exits + * that can occur during instruction execution typically result in KVM skipping + * the instruction or injecting an exception, e.g. instruction and exception + * intercepts, and because pending exceptions have higher priority than pending + * interrupts, KVM still honors instruction boundaries in most scenarios. + * + * But, if a VM-Exit occurs during instruction execution, and KVM does NOT skip + * the instruction or inject an exception, then KVM can incorrecty inject a new + * asynchrounous event if the event became pending after the CPU fetched the + * instruction (in the guest). E.g. if a page fault (#PF, #NPF, EPT violation) + * occurs and is resolved by KVM, a coincident NMI, SMI, IRQ, etc... can be + * injected on the restarted instruction instead of being deferred until the + * instruction completes. + * + * In practice, this virtualization hole is unlikely to be observed by the + * guest, and even less likely to cause functional problems. To detect the + * hole, the guest would have to trigger an event on a side effect of an early + * phase of instruction execution, e.g. on the instruction fetch from memory. + * And for it to be a functional problem, the guest would need to depend on the + * ordering between that side effect, the instruction completing, _and_ the + * delivery of the asynchronous event. + */ +static int kvm_check_and_inject_events(struct kvm_vcpu *vcpu, + bool *req_immediate_exit) { bool can_inject; int r; @@ -10183,7 +10223,7 @@ void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu) * When APICv gets disabled, we may still have injected interrupts * pending. At the same time, KVM_REQ_EVENT may not be set as APICv was * still active when the interrupt got accepted. Make sure - * inject_pending_event() is called to check for that. + * kvm_check_and_inject_events() is called to check for that. */ if (!apic->apicv_active) kvm_make_request(KVM_REQ_EVENT, vcpu); @@ -10480,7 +10520,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) goto out; } - r = inject_pending_event(vcpu, &req_immediate_exit); + r = kvm_check_and_inject_events(vcpu, &req_immediate_exit); if (r < 0) { r = 0; goto out; From patchwork Sat Jul 23 00:51:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EF285C43334 for ; Sat, 23 Jul 2022 00:54:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236904AbiGWAy3 (ORCPT ); Fri, 22 Jul 2022 20:54:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58310 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237039AbiGWAxV (ORCPT ); Fri, 22 Jul 2022 20:53:21 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1AE8BC1DF9 for ; Fri, 22 Jul 2022 17:52:32 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id u6-20020a25b7c6000000b00670862c5b16so4808021ybj.12 for ; Fri, 22 Jul 2022 17:52:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=iNidL5t3APu7KPx6x2AfmNcK6Gvmn9B++/Fwc0Sb5RU=; b=CpXe5BAYVMXS27JFgFQwxTrHUZ25kMD88IWo8FWkEHN1dllaPHhyYki1UJIVV/A43U NZPe5Ag886po3sZTKm35rsu8ILeZRiWvA80b3SS3R5S/vUCmGfQ1IuJSnAgJTSYw2VgU QjydX7sOS6kAd3TULu6wTWluz2ZOrx19/A1onVREHk//+UHeJOpzfn7B5CBQHWBgsqYF 4qirmGgP+bK8oO6ZfQ8bzzIF3KeHFnIBqdJ85i3CR/R0QYizLFj4p/bGGZp68vlBIssM xldRiQF9z8lx2IlHHsLz0XWJH/QudIm2GWn/RztEt68uaOGSgijPFSLTXr1vU52VfR/j npbA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=iNidL5t3APu7KPx6x2AfmNcK6Gvmn9B++/Fwc0Sb5RU=; b=YSIhxOCtJI21iy2HIHgzizwqW90t6PkvXepaqGn9N0C02PNgZle61SCPHeB+b2oi12 LpbSsCMUcFnOxCe7WjHywKYfE2laLNO5XT9Ky/s/wg0PjdDhy87KkBJCYv3VQP7BQwpu kbxkkSCUj7gnVY62qRVdstLhJBrA8Xf8hVySwehGcsoZcQ/wuXw17x+OFRm2/Wk7HQjw IlgUySbg8wcRUmNrnKePB+8U5z/hgnVKHUGKM9YYMaiguuscEK+KEgEie3loTxaQd9Bj zM9PILkMZV4EA41axkmXpOgtcMYsbML4EVKuXRnLCMYutDDdcw4+wUdvRlHmCrvK2CTS f77w== X-Gm-Message-State: AJIora+EwBKskrCtmp8leOKvn8YFQKXkjIOEABKsDSR8mEIwbxWPtduv 0QMosP+z2xT72oLMA6Bg13Ri2LGVVWs= X-Google-Smtp-Source: AGRyM1vySDwE6Gvlodkq7FF1qcIENLkcMWrn40MsXgK66bhooD5J8AhouK2hcmn3uEoFR1HA3LtGtxEeE74= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a81:69c3:0:b0:31e:8481:21f5 with SMTP id e186-20020a8169c3000000b0031e848121f5mr2088956ywc.63.1658537538665; Fri, 22 Jul 2022 17:52:18 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:36 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-24-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 23/24] KVM: selftests: Use uapi header to get VMX and SVM exit reasons/codes From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Include the vmx.h and svm.h uapi headers that KVM so kindly provides instead of manually defining all the same exit reasons/code. Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- .../selftests/kvm/include/x86_64/svm_util.h | 7 +-- .../selftests/kvm/include/x86_64/vmx.h | 51 +------------------ 2 files changed, 4 insertions(+), 54 deletions(-) diff --git a/tools/testing/selftests/kvm/include/x86_64/svm_util.h b/tools/testing/selftests/kvm/include/x86_64/svm_util.h index a339b537a575..7aee6244ab6a 100644 --- a/tools/testing/selftests/kvm/include/x86_64/svm_util.h +++ b/tools/testing/selftests/kvm/include/x86_64/svm_util.h @@ -9,15 +9,12 @@ #ifndef SELFTEST_KVM_SVM_UTILS_H #define SELFTEST_KVM_SVM_UTILS_H +#include + #include #include "svm.h" #include "processor.h" -#define SVM_EXIT_EXCP_BASE 0x040 -#define SVM_EXIT_HLT 0x078 -#define SVM_EXIT_MSR 0x07c -#define SVM_EXIT_VMMCALL 0x081 - struct svm_test_data { /* VMCB */ struct vmcb *vmcb; /* gva */ diff --git a/tools/testing/selftests/kvm/include/x86_64/vmx.h b/tools/testing/selftests/kvm/include/x86_64/vmx.h index 99fa1410964c..e4206f69b716 100644 --- a/tools/testing/selftests/kvm/include/x86_64/vmx.h +++ b/tools/testing/selftests/kvm/include/x86_64/vmx.h @@ -8,6 +8,8 @@ #ifndef SELFTEST_KVM_VMX_H #define SELFTEST_KVM_VMX_H +#include + #include #include "processor.h" #include "apic.h" @@ -100,55 +102,6 @@ #define VMX_EPT_VPID_CAP_AD_BITS 0x00200000 #define EXIT_REASON_FAILED_VMENTRY 0x80000000 -#define EXIT_REASON_EXCEPTION_NMI 0 -#define EXIT_REASON_EXTERNAL_INTERRUPT 1 -#define EXIT_REASON_TRIPLE_FAULT 2 -#define EXIT_REASON_INTERRUPT_WINDOW 7 -#define EXIT_REASON_NMI_WINDOW 8 -#define EXIT_REASON_TASK_SWITCH 9 -#define EXIT_REASON_CPUID 10 -#define EXIT_REASON_HLT 12 -#define EXIT_REASON_INVD 13 -#define EXIT_REASON_INVLPG 14 -#define EXIT_REASON_RDPMC 15 -#define EXIT_REASON_RDTSC 16 -#define EXIT_REASON_VMCALL 18 -#define EXIT_REASON_VMCLEAR 19 -#define EXIT_REASON_VMLAUNCH 20 -#define EXIT_REASON_VMPTRLD 21 -#define EXIT_REASON_VMPTRST 22 -#define EXIT_REASON_VMREAD 23 -#define EXIT_REASON_VMRESUME 24 -#define EXIT_REASON_VMWRITE 25 -#define EXIT_REASON_VMOFF 26 -#define EXIT_REASON_VMON 27 -#define EXIT_REASON_CR_ACCESS 28 -#define EXIT_REASON_DR_ACCESS 29 -#define EXIT_REASON_IO_INSTRUCTION 30 -#define EXIT_REASON_MSR_READ 31 -#define EXIT_REASON_MSR_WRITE 32 -#define EXIT_REASON_INVALID_STATE 33 -#define EXIT_REASON_MWAIT_INSTRUCTION 36 -#define EXIT_REASON_MONITOR_INSTRUCTION 39 -#define EXIT_REASON_PAUSE_INSTRUCTION 40 -#define EXIT_REASON_MCE_DURING_VMENTRY 41 -#define EXIT_REASON_TPR_BELOW_THRESHOLD 43 -#define EXIT_REASON_APIC_ACCESS 44 -#define EXIT_REASON_EOI_INDUCED 45 -#define EXIT_REASON_EPT_VIOLATION 48 -#define EXIT_REASON_EPT_MISCONFIG 49 -#define EXIT_REASON_INVEPT 50 -#define EXIT_REASON_RDTSCP 51 -#define EXIT_REASON_PREEMPTION_TIMER 52 -#define EXIT_REASON_INVVPID 53 -#define EXIT_REASON_WBINVD 54 -#define EXIT_REASON_XSETBV 55 -#define EXIT_REASON_APIC_WRITE 56 -#define EXIT_REASON_INVPCID 58 -#define EXIT_REASON_PML_FULL 62 -#define EXIT_REASON_XSAVES 63 -#define EXIT_REASON_XRSTORS 64 -#define LAST_EXIT_REASON 64 enum vmcs_field { VIRTUAL_PROCESSOR_ID = 0x00000000, From patchwork Sat Jul 23 00:51:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 12927026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A7287C43334 for ; Sat, 23 Jul 2022 00:54:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236160AbiGWAyc (ORCPT ); Fri, 22 Jul 2022 20:54:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58380 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237061AbiGWAx0 (ORCPT ); Fri, 22 Jul 2022 20:53:26 -0400 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A3AC6C3801 for ; Fri, 22 Jul 2022 17:52:33 -0700 (PDT) Received: by mail-pl1-x649.google.com with SMTP id w2-20020a170902e88200b0016d47f39771so901782plg.4 for ; Fri, 22 Jul 2022 17:52:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=x3fBv6Xpvt2tm3X/TNtbKJCdyAgtgJOraCquwezjnks=; b=ONjI9x7VKOP8kXmSUDc/IyOMjoGexjQNg4aWwbsi0iyFELRJC0p4TOb7E9d6LmR8MC Z0oKBLeSdTc1+9bRKkEEt7GPKNfpcGxHP0jYM2yKz65b93IcxEcUVW66pRNsqhnSnqXZ ucKxPxdFhSJpnLJ8oLwngRlRO9Snp4jeKuR3d9lULeTQywGfPBG9E3mdQgzcS5KnMeZk 2NdlawuWHikXAf7z/OSlENg6eNPilaWQni1fc1LNZd/TQ147uhF0VsQxjx444FQoIA4e XgUK4Uk9gUC+8G1xi6R1S/Diuln0apB5tvCF5EpP5ysXiMZGQaDEBVwR6atlDSH3TbLU vq7g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=x3fBv6Xpvt2tm3X/TNtbKJCdyAgtgJOraCquwezjnks=; b=YNzfhjmc3H6woPGMm5g499f4QkWtl7Q4kH/I3xgX7Bgjt9ynMB8ejommVph/53SCEN j8BVltbBu990ryN6yScibeSLrh4zHE6mqiejTa3qHuYXX0OsU0berWmTIzvN5uZBIGgP REZeCKg6C39mu1BzkeulrBzV5RLkoPpGS1h0Q1lxb4HKHTsFH10J3y8Dd9QX8GeGFGuh 3+uEGpf/724KuDUGm3fwHcdDKnQvdRaSX2QwU4ibgPOj4/Y8kZ38nOiaZ0PoK3OSHBqV 5hd9+CISfg3CLecJaA72BxxaiR3KvI5ZvrZD038rgEeKa/TfWsTTNKTSFD7AW88mD7sc Bw2Q== X-Gm-Message-State: AJIora/t85FL7TGVaKfC6uwq0Zwt7QwSMD5Pa1Zusfzow7kta4wPyHhc uB9Cz63lCQSS9f7gByTMAO4AZxDTEMA= X-Google-Smtp-Source: AGRyM1vm6sJKemTkEiYntMmW56V201SmpeKkIXMPMlgXZHTl1ZEUAKKWqA6+biYHZcoPnKZGHoyFOIQj7Yo= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a05:6a00:26f1:b0:52b:d0aa:4178 with SMTP id p49-20020a056a0026f100b0052bd0aa4178mr2530729pfw.86.1658537540338; Fri, 22 Jul 2022 17:52:20 -0700 (PDT) Reply-To: Sean Christopherson Date: Sat, 23 Jul 2022 00:51:37 +0000 In-Reply-To: <20220723005137.1649592-1-seanjc@google.com> Message-Id: <20220723005137.1649592-25-seanjc@google.com> Mime-Version: 1.0 References: <20220723005137.1649592-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog Subject: [PATCH v4 24/24] KVM: selftests: Add an x86-only test to verify nested exception queueing From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson , Maxim Levitsky , Oliver Upton , Peter Shier Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add a test to verify that KVM_{G,S}ET_EVENTS play nice with pending vs. injected exceptions when an exception is being queued for L2, and that KVM correctly handles L1's exception intercept wants. Signed-off-by: Sean Christopherson Reviewed-by: Maxim Levitsky --- tools/testing/selftests/kvm/.gitignore | 1 + tools/testing/selftests/kvm/Makefile | 1 + .../kvm/x86_64/nested_exceptions_test.c | 295 ++++++++++++++++++ 3 files changed, 297 insertions(+) create mode 100644 tools/testing/selftests/kvm/x86_64/nested_exceptions_test.c diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore index d625a3f83780..45d9aee1c0d8 100644 --- a/tools/testing/selftests/kvm/.gitignore +++ b/tools/testing/selftests/kvm/.gitignore @@ -28,6 +28,7 @@ /x86_64/max_vcpuid_cap_test /x86_64/mmio_warning_test /x86_64/monitor_mwait_test +/x86_64/nested_exceptions_test /x86_64/nx_huge_pages_test /x86_64/platform_info_test /x86_64/pmu_event_filter_test diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index 690b499c3471..05a34b6be26b 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -88,6 +88,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/kvm_clock_test TEST_GEN_PROGS_x86_64 += x86_64/kvm_pv_test TEST_GEN_PROGS_x86_64 += x86_64/mmio_warning_test TEST_GEN_PROGS_x86_64 += x86_64/monitor_mwait_test +TEST_GEN_PROGS_x86_64 += x86_64/nested_exceptions_test TEST_GEN_PROGS_x86_64 += x86_64/platform_info_test TEST_GEN_PROGS_x86_64 += x86_64/pmu_event_filter_test TEST_GEN_PROGS_x86_64 += x86_64/set_boot_cpu_id diff --git a/tools/testing/selftests/kvm/x86_64/nested_exceptions_test.c b/tools/testing/selftests/kvm/x86_64/nested_exceptions_test.c new file mode 100644 index 000000000000..ac33835f78f4 --- /dev/null +++ b/tools/testing/selftests/kvm/x86_64/nested_exceptions_test.c @@ -0,0 +1,295 @@ +// SPDX-License-Identifier: GPL-2.0-only +#define _GNU_SOURCE /* for program_invocation_short_name */ + +#include "test_util.h" +#include "kvm_util.h" +#include "processor.h" +#include "vmx.h" +#include "svm_util.h" + +#define L2_GUEST_STACK_SIZE 256 + +/* + * Arbitrary, never shoved into KVM/hardware, just need to avoid conflict with + * the "real" exceptions used, #SS/#GP/#DF (12/13/8). + */ +#define FAKE_TRIPLE_FAULT_VECTOR 0xaa + +/* Arbitrary 32-bit error code injected by this test. */ +#define SS_ERROR_CODE 0xdeadbeef + +/* + * Bit '0' is set on Intel if the exception occurs while delivering a previous + * event/exception. AMD's wording is ambiguous, but presumably the bit is set + * if the exception occurs while delivering an external event, e.g. NMI or INTR, + * but not for exceptions that occur when delivering other exceptions or + * software interrupts. + * + * Note, Intel's name for it, "External event", is misleading and much more + * aligned with AMD's behavior, but the SDM is quite clear on its behavior. + */ +#define ERROR_CODE_EXT_FLAG BIT(0) + +/* + * Bit '1' is set if the fault occurred when looking up a descriptor in the + * IDT, which is the case here as the IDT is empty/NULL. + */ +#define ERROR_CODE_IDT_FLAG BIT(1) + +/* + * The #GP that occurs when vectoring #SS should show the index into the IDT + * for #SS, plus have the "IDT flag" set. + */ +#define GP_ERROR_CODE_AMD ((SS_VECTOR * 8) | ERROR_CODE_IDT_FLAG) +#define GP_ERROR_CODE_INTEL ((SS_VECTOR * 8) | ERROR_CODE_IDT_FLAG | ERROR_CODE_EXT_FLAG) + +/* + * Intel and AMD both shove '0' into the error code on #DF, regardless of what + * led to the double fault. + */ +#define DF_ERROR_CODE 0 + +#define INTERCEPT_SS (BIT_ULL(SS_VECTOR)) +#define INTERCEPT_SS_DF (INTERCEPT_SS | BIT_ULL(DF_VECTOR)) +#define INTERCEPT_SS_GP_DF (INTERCEPT_SS_DF | BIT_ULL(GP_VECTOR)) + +static void l2_ss_pending_test(void) +{ + GUEST_SYNC(SS_VECTOR); +} + +static void l2_ss_injected_gp_test(void) +{ + GUEST_SYNC(GP_VECTOR); +} + +static void l2_ss_injected_df_test(void) +{ + GUEST_SYNC(DF_VECTOR); +} + +static void l2_ss_injected_tf_test(void) +{ + GUEST_SYNC(FAKE_TRIPLE_FAULT_VECTOR); +} + +static void svm_run_l2(struct svm_test_data *svm, void *l2_code, int vector, + uint32_t error_code) +{ + struct vmcb *vmcb = svm->vmcb; + struct vmcb_control_area *ctrl = &vmcb->control; + + vmcb->save.rip = (u64)l2_code; + run_guest(vmcb, svm->vmcb_gpa); + + if (vector == FAKE_TRIPLE_FAULT_VECTOR) + return; + + GUEST_ASSERT_EQ(ctrl->exit_code, (SVM_EXIT_EXCP_BASE + vector)); + GUEST_ASSERT_EQ(ctrl->exit_info_1, error_code); +} + +static void l1_svm_code(struct svm_test_data *svm) +{ + struct vmcb_control_area *ctrl = &svm->vmcb->control; + unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE]; + + generic_svm_setup(svm, NULL, &l2_guest_stack[L2_GUEST_STACK_SIZE]); + svm->vmcb->save.idtr.limit = 0; + ctrl->intercept |= BIT_ULL(INTERCEPT_SHUTDOWN); + + ctrl->intercept_exceptions = INTERCEPT_SS_GP_DF; + svm_run_l2(svm, l2_ss_pending_test, SS_VECTOR, SS_ERROR_CODE); + svm_run_l2(svm, l2_ss_injected_gp_test, GP_VECTOR, GP_ERROR_CODE_AMD); + + ctrl->intercept_exceptions = INTERCEPT_SS_DF; + svm_run_l2(svm, l2_ss_injected_df_test, DF_VECTOR, DF_ERROR_CODE); + + ctrl->intercept_exceptions = INTERCEPT_SS; + svm_run_l2(svm, l2_ss_injected_tf_test, FAKE_TRIPLE_FAULT_VECTOR, 0); + GUEST_ASSERT_EQ(ctrl->exit_code, SVM_EXIT_SHUTDOWN); + + GUEST_DONE(); +} + +static void vmx_run_l2(void *l2_code, int vector, uint32_t error_code) +{ + GUEST_ASSERT(!vmwrite(GUEST_RIP, (u64)l2_code)); + + GUEST_ASSERT_EQ(vector == SS_VECTOR ? vmlaunch() : vmresume(), 0); + + if (vector == FAKE_TRIPLE_FAULT_VECTOR) + return; + + GUEST_ASSERT_EQ(vmreadz(VM_EXIT_REASON), EXIT_REASON_EXCEPTION_NMI); + GUEST_ASSERT_EQ((vmreadz(VM_EXIT_INTR_INFO) & 0xff), vector); + GUEST_ASSERT_EQ(vmreadz(VM_EXIT_INTR_ERROR_CODE), error_code); +} + +static void l1_vmx_code(struct vmx_pages *vmx) +{ + unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE]; + + GUEST_ASSERT_EQ(prepare_for_vmx_operation(vmx), true); + + GUEST_ASSERT_EQ(load_vmcs(vmx), true); + + prepare_vmcs(vmx, NULL, &l2_guest_stack[L2_GUEST_STACK_SIZE]); + GUEST_ASSERT_EQ(vmwrite(GUEST_IDTR_LIMIT, 0), 0); + + /* + * VMX disallows injecting an exception with error_code[31:16] != 0, + * and hardware will never generate a VM-Exit with bits 31:16 set. + * KVM should likewise truncate the "bad" userspace value. + */ + GUEST_ASSERT_EQ(vmwrite(EXCEPTION_BITMAP, INTERCEPT_SS_GP_DF), 0); + vmx_run_l2(l2_ss_pending_test, SS_VECTOR, (u16)SS_ERROR_CODE); + vmx_run_l2(l2_ss_injected_gp_test, GP_VECTOR, GP_ERROR_CODE_INTEL); + + GUEST_ASSERT_EQ(vmwrite(EXCEPTION_BITMAP, INTERCEPT_SS_DF), 0); + vmx_run_l2(l2_ss_injected_df_test, DF_VECTOR, DF_ERROR_CODE); + + GUEST_ASSERT_EQ(vmwrite(EXCEPTION_BITMAP, INTERCEPT_SS), 0); + vmx_run_l2(l2_ss_injected_tf_test, FAKE_TRIPLE_FAULT_VECTOR, 0); + GUEST_ASSERT_EQ(vmreadz(VM_EXIT_REASON), EXIT_REASON_TRIPLE_FAULT); + + GUEST_DONE(); +} + +static void __attribute__((__flatten__)) l1_guest_code(void *test_data) +{ + if (this_cpu_has(X86_FEATURE_SVM)) + l1_svm_code(test_data); + else + l1_vmx_code(test_data); +} + +static void assert_ucall_vector(struct kvm_vcpu *vcpu, int vector) +{ + struct kvm_run *run = vcpu->run; + struct ucall uc; + + TEST_ASSERT(run->exit_reason == KVM_EXIT_IO, + "Unexpected exit reason: %u (%s),\n", + run->exit_reason, exit_reason_str(run->exit_reason)); + + switch (get_ucall(vcpu, &uc)) { + case UCALL_SYNC: + TEST_ASSERT(vector == uc.args[1], + "Expected L2 to ask for %d, got %ld", vector, uc.args[1]); + break; + case UCALL_DONE: + TEST_ASSERT(vector == -1, + "Expected L2 to ask for %d, L2 says it's done", vector); + break; + case UCALL_ABORT: + TEST_FAIL("%s at %s:%ld (0x%lx != 0x%lx)", + (const char *)uc.args[0], __FILE__, uc.args[1], + uc.args[2], uc.args[3]); + break; + default: + TEST_FAIL("Expected L2 to ask for %d, got unexpected ucall %lu", vector, uc.cmd); + } +} + +static void queue_ss_exception(struct kvm_vcpu *vcpu, bool inject) +{ + struct kvm_vcpu_events events; + + vcpu_events_get(vcpu, &events); + + TEST_ASSERT(!events.exception.pending, + "Vector %d unexpectedlt pending", events.exception.nr); + TEST_ASSERT(!events.exception.injected, + "Vector %d unexpectedly injected", events.exception.nr); + + events.flags = KVM_VCPUEVENT_VALID_PAYLOAD; + events.exception.pending = !inject; + events.exception.injected = inject; + events.exception.nr = SS_VECTOR; + events.exception.has_error_code = true; + events.exception.error_code = SS_ERROR_CODE; + vcpu_events_set(vcpu, &events); +} + +/* + * Verify KVM_{G,S}ET_EVENTS play nice with pending vs. injected exceptions + * when an exception is being queued for L2. Specifically, verify that KVM + * honors L1 exception intercept controls when a #SS is pending/injected, + * triggers a #GP on vectoring the #SS, morphs to #DF if #GP isn't intercepted + * by L1, and finally causes (nested) SHUTDOWN if #DF isn't intercepted by L1. + */ +int main(int argc, char *argv[]) +{ + vm_vaddr_t nested_test_data_gva; + struct kvm_vcpu_events events; + struct kvm_vcpu *vcpu; + struct kvm_vm *vm; + + TEST_REQUIRE(kvm_has_cap(KVM_CAP_EXCEPTION_PAYLOAD)); + TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_SVM) || kvm_cpu_has(X86_FEATURE_VMX)); + + vm = vm_create_with_one_vcpu(&vcpu, l1_guest_code); + vm_enable_cap(vm, KVM_CAP_EXCEPTION_PAYLOAD, -2ul); + + if (kvm_cpu_has(X86_FEATURE_SVM)) + vcpu_alloc_svm(vm, &nested_test_data_gva); + else + vcpu_alloc_vmx(vm, &nested_test_data_gva); + + vcpu_args_set(vcpu, 1, nested_test_data_gva); + + /* Run L1 => L2. L2 should sync and request #SS. */ + vcpu_run(vcpu); + assert_ucall_vector(vcpu, SS_VECTOR); + + /* Pend #SS and request immediate exit. #SS should still be pending. */ + queue_ss_exception(vcpu, false); + vcpu->run->immediate_exit = true; + vcpu_run_complete_io(vcpu); + + /* Verify the pending events comes back out the same as it went in. */ + vcpu_events_get(vcpu, &events); + ASSERT_EQ(events.flags & KVM_VCPUEVENT_VALID_PAYLOAD, + KVM_VCPUEVENT_VALID_PAYLOAD); + ASSERT_EQ(events.exception.pending, true); + ASSERT_EQ(events.exception.nr, SS_VECTOR); + ASSERT_EQ(events.exception.has_error_code, true); + ASSERT_EQ(events.exception.error_code, SS_ERROR_CODE); + + /* + * Run for real with the pending #SS, L1 should get a VM-Exit due to + * #SS interception and re-enter L2 to request #GP (via injected #SS). + */ + vcpu->run->immediate_exit = false; + vcpu_run(vcpu); + assert_ucall_vector(vcpu, GP_VECTOR); + + /* + * Inject #SS, the #SS should bypass interception and cause #GP, which + * L1 should intercept before KVM morphs it to #DF. L1 should then + * disable #GP interception and run L2 to request #DF (via #SS => #GP). + */ + queue_ss_exception(vcpu, true); + vcpu_run(vcpu); + assert_ucall_vector(vcpu, DF_VECTOR); + + /* + * Inject #SS, the #SS should bypass interception and cause #GP, which + * L1 is no longer interception, and so should see a #DF VM-Exit. L1 + * should then signal that is done. + */ + queue_ss_exception(vcpu, true); + vcpu_run(vcpu); + assert_ucall_vector(vcpu, FAKE_TRIPLE_FAULT_VECTOR); + + /* + * Inject #SS yet again. L1 is not intercepting #GP or #DF, and so + * should see nested TRIPLE_FAULT / SHUTDOWN. + */ + queue_ss_exception(vcpu, true); + vcpu_run(vcpu); + assert_ucall_vector(vcpu, -1); + + kvm_vm_free(vm); +}