From patchwork Mon Aug 30 12:55:33 2021
X-Patchwork-Submitter: Maxim Levitsky
X-Patchwork-Id: 12465187
From: Maxim Levitsky
To: kvm@vger.kernel.org
Cc: Jim Mattson, Sean Christopherson, Paolo Bonzini,
    x86@kernel.org (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)),
    Ingo Molnar, Vitaly Kuznetsov, Thomas Gleixner, Maxim Levitsky,
    linux-kernel@vger.kernel.org (open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)),
    Joerg Roedel, Wanpeng Li, "H. Peter Anvin", Borislav Petkov
Subject: [PATCH v2 0/6] KVM: few more SMM fixes
Date: Mon, 30 Aug 2021 15:55:33 +0300
Message-Id: <20210830125539.1768833-1-mlevitsk@redhat.com>

These are a few SMM fixes I was working on last week.
(V2: I merged the VMX SMM fixes here, while the SVM SMM fixes are unchanged
from V1.)

* Patch 1 fixes a minor issue that remained after commit 37be407b2ce8
  ("KVM: nSVM: Fix L1 state corruption upon return from SMM").

  While returns to guest mode from SMM now work thanks to the state restored
  from the HSAVE area, the guest entry itself still sees stale HSAVE state.
  This for example breaks return from SMM when the guest is 32 bit, because
  the PDPTRs are loaded through an MMU that was set up with the incorrect
  L1 HSAVE state.

* Patch 2 fixes a theoretical issue that I introduced with my SREGS2
  patchset, which Sean Christopherson pointed out.
  The issue is that the KVM_REQ_GET_NESTED_STATE_PAGES request is not only
  used to complete loading of the nested state, but also to complete the
  exit from SMM to guest mode, and my pdptrs_from_userspace compatibility
  hack was written assuming it isn't.

  While it would be safe to just reset 'pdptrs_from_userspace' on each
  VM entry, I don't want to slow down the common path for this very rare
  hack. Instead I explicitly zero the variable when the exit from SMM to
  guest mode is completed, because in this case the PDPTRs always need to
  be reloaded from memory.

  Note that this is a theoretical issue only: after the vendor return-from-SMM
  code (aka .leave_smm) runs, even when it returned to guest mode and loaded
  some of the L2 CPU state, we still load all of the L2 state captured in
  SMRAM, which includes CR3, at which point the guest PDPTRs are reloaded
  anyway.

  Also note that across SMIs CR3 does not seem to be updated, and Intel's
  SDM notes that its saved value in SMRAM isn't writable, so if the SMM
  handler doesn't change CR3, the PDPTRs might not be touched. In theory
  that means an SMI handler could preserve the PDPTRs by never touching CR3,
  but since we recently removed the code that skipped the PDPTR update when
  CR3 didn't change, I guess it won't work. Anyway, I don't think any OS
  bothers to keep the PDPTRs out of sync with whatever page CR3 points at,
  so I didn't try to test what real hardware does in this case.

* Patch 3 makes the SVM SMM exit a bit more similar to how VMX does it,
  by also raising the KVM_REQ_GET_NESTED_STATE_PAGES request.

  I do have doubts about why we need to do this on VMX, though. The initial
  justification comes from commit 7f7f1ba33cf2 ("KVM: x86: do not load
  vmcs12 pages while still in SMM"). With all the MMU changes, I am not
  sure we can still hit a case of a not-up-to-date MMU when entering the
  nested guest from SMM. On SVM it does seem to work without this anyway.
* Patch 4 fixes a guest emulation failure when unrestricted_guest=0 and we
  reach handle_exception_nmi_irqoff. That function takes stale values from
  the current VMCS, not taking into account that we are currently emulating
  an invalid guest state and thus no VM exit actually happened.

* Patch 5 fixes a corner case where return from SMM slightly corrupts the
  L2 segment register state when unrestricted_guest=0, due to the real mode
  segment caching register logic, even though the state is later restored
  correctly from SMRAM. Fix this by not failing
  nested_vmx_enter_non_root_mode and delaying the failure to the next
  nested VM entry.

* Patch 6 fixes another corner case where emulation_required was not
  updated correctly on nested VM exit when restoring the L1 segment
  registers.

I still track 2 SMM issues:

1. When a HyperV guest runs nested and uses SMM-enabled OVMF, it crashes
   and reboots during the boot process.

2. Nested migration on VMX is still broken when L1 floods itself with SMIs.

Best regards,
	Maxim Levitsky

Maxim Levitsky (6):
  KVM: SVM: restore the L1 host state prior to resuming nested guest on
    SMM exit
  KVM: x86: force PDPTR reload on SMM exit
  KVM: nSVM: call KVM_REQ_GET_NESTED_STATE_PAGES on exit from SMM mode
  KVM: VMX: synthesize invalid VM exit when emulating invalid guest state
  KVM: nVMX: don't fail nested VM entry on invalid guest state if
    !from_vmentry
  KVM: nVMX: re-evaluate emulation_required on nested VM exit

 arch/x86/kvm/svm/nested.c | 10 +++++++---
 arch/x86/kvm/svm/svm.c    | 27 ++++++++++++++++++---------
 arch/x86/kvm/svm/svm.h    |  3 ++-
 arch/x86/kvm/vmx/nested.c |  9 ++++++++-
 arch/x86/kvm/vmx/vmx.c    | 35 ++++++++++++++++++++++++++++-------
 arch/x86/kvm/vmx/vmx.h    |  1 +
 6 files changed, 64 insertions(+), 21 deletions(-)