From patchwork Tue Aug 6 15:48:54 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Kiszka
X-Patchwork-Id: 2839447
Return-Path:
X-Original-To: patchwork-kvm@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.19.201])
 by patchwork2.web.kernel.org (Postfix) with ESMTP id 395E7BF535
 for ; Tue, 6 Aug 2013 15:49:17 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
 by mail.kernel.org (Postfix) with ESMTP id D63B3201C4
 for ; Tue, 6 Aug 2013 15:49:15 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by mail.kernel.org (Postfix) with ESMTP id 08884201BA
 for ; Tue, 6 Aug 2013 15:49:13 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1755998Ab3HFPtI (ORCPT ); Tue, 6 Aug 2013 11:49:08 -0400
Received: from david.siemens.de ([192.35.17.14]:22181 "EHLO david.siemens.de"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1755113Ab3HFPtH (ORCPT ); Tue, 6 Aug 2013 11:49:07 -0400
Received: from mail1.siemens.de (localhost [127.0.0.1])
 by david.siemens.de (8.13.6/8.13.6) with ESMTP id r76Fmt50024632;
 Tue, 6 Aug 2013 17:48:55 +0200
Received: from mchn199C.mchp.siemens.de ([139.25.40.156])
 by mail1.siemens.de (8.13.6/8.13.6) with ESMTP id r76FmsNc006446;
 Tue, 6 Aug 2013 17:48:55 +0200
Message-ID: <52011AE6.2010006@siemens.com>
Date: Tue, 06 Aug 2013 17:48:54 +0200
From: Jan Kiszka
User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); de; rv:1.8.1.12)
 Gecko/20080226 SUSE/2.0.0.12-1.1 Thunderbird/2.0.0.12 Mnenhy/0.7.5.666
MIME-Version: 1.0
To: "Zhang, Yang Z"
CC: Gleb Natapov , Paolo Bonzini , kvm , Xiao Guangrong ,
 "Nakajima, Jun" , Arthur Chunqi Li
Subject: Re: [PATCH v2 5/8] KVM: nVMX: Fix guest CR3 read-back on VM-exit
References: <0816baee846f9c8f4d54c6738b2582a95f9c56a3.1375778397.git.jan.kiszka@web.de>
 <20130806101236.GN8218@redhat.com>
 <20130806140248.GB8218@redhat.com>
 <20130806144117.GD8218@redhat.com>
In-Reply-To:
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org
X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI,
 RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

On 2013-08-06 17:04, Zhang, Yang Z wrote:
> Gleb Natapov wrote on 2013-08-06:
>> On Tue, Aug 06, 2013 at 02:12:51PM +0000, Zhang, Yang Z wrote:
>>> Gleb Natapov wrote on 2013-08-06:
>>>> On Tue, Aug 06, 2013 at 11:44:41AM +0000, Zhang, Yang Z wrote:
>>>>> Gleb Natapov wrote on 2013-08-06:
>>>>>> On Tue, Aug 06, 2013 at 10:39:59AM +0200, Jan Kiszka wrote:
>>>>>>> From: Jan Kiszka
>>>>>>>
>>>>>>> If nested EPT is enabled, the L2 guest may change CR3 without any
>>>>>>> exits. We therefore have to read the current value from the VMCS
>>>>>>> when switching to L1. However, if paging wasn't enabled, L0 tracks
>>>>>>> L2's CR3, and GUEST_CR3 rather contains the real-mode identity map.
>>>>>>> So we need to retrieve CR3 from the architectural state after
>>>>>>> conditionally updating it - and this is what kvm_read_cr3 does.
>>>>>>>
>>>>>> I have a headache from trying to think about it already, but shouldn't
>>>>>> L1 be the one who setups identity map for L2? I traced what
>>>>>> vmcs_read64(GUEST_CR3)/kvm_read_cr3(vcpu) return here and do not see
>>>>> Here is my understanding:
>>>>> In vmx_set_cr3(), if enabled ept, it will check whether target
>>>>> vcpu is enabling paging. When L2 running in real mode, then target
>>>>> vcpu is not enabling paging and it will use L0's identity map for L2.
>>>>> If you read GUEST_CR3 from VMCS, then you may get the L2's identity
>>>>> map not L1's.
>>>>>
>>>> Yes, but why it makes sense to use L0 identity map for L2?
>>>> I didn't see different vmcs_read64(GUEST_CR3)/kvm_read_cr3(vcpu)
>>>> values because L0 and L1 use the same identity map address. When I
>>>> changed the identity address L1 configures, vmcs_read64(GUEST_CR3) and
>>>> kvm_read_cr3(vcpu) are indeed different, but the real CR3 L2 uses
>>>> points to L0's identity map. If I zero L1's identity map page, L2
>>>> still works.
>>>>
>>> If L2 is in real mode, then L2PA == L1PA. So L0's identity map also
>>> works if L2 is in real mode.
>>>
>> That's not the point. It may work accidentally for kvm on kvm, but what
>> if another hypervisor plays different tricks and builds a different
>> ident map for its guest?
> Yes, if another hypervisor doesn't build the 1:1 mapping for its guest,
> it will fail to work. But I cannot imagine what kind of hypervisor would
> do this and what the purpose would be.
> Anyway, the current logic is definitely wrong. It should use L1's
> identity map instead of L0's.

So something like this is rather needed?

Jan

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 44494ed..60a3644 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3375,8 +3375,10 @@ static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 	if (enable_ept) {
 		eptp = construct_eptp(cr3);
 		vmcs_write64(EPT_POINTER, eptp);
-		guest_cr3 = is_paging(vcpu) ? kvm_read_cr3(vcpu) :
-			vcpu->kvm->arch.ept_identity_map_addr;
+		if (is_paging(vcpu) || is_guest_mode(vcpu))
+			guest_cr3 = kvm_read_cr3(vcpu);
+		else
+			guest_cr3 = vcpu->kvm->arch.ept_identity_map_addr;
 		ept_load_pdptrs(vcpu);
 	}
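
To make the effect of the changed condition easier to see, here is a
minimal user-space model of the two selection rules. It is only a sketch:
struct model_vcpu, its fields, and the select_guest_cr3_*() helpers are
hypothetical names that stand in for is_paging(), is_guest_mode(),
kvm_read_cr3() and ept_identity_map_addr, and the addresses are arbitrary.
It also assumes, as discussed above, that for a non-paging L2 the
architectural CR3 holds whatever L1 configured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for vCPU state; not struct kvm_vcpu. */
struct model_vcpu {
	bool paging_enabled;      /* models is_paging(vcpu) */
	bool guest_mode;          /* models is_guest_mode(vcpu), i.e. L2 runs */
	uint64_t arch_cr3;        /* models kvm_read_cr3(vcpu) */
	uint64_t l0_identity_map; /* models kvm->arch.ept_identity_map_addr */
};

/* Rule before the change: any non-paging vCPU gets L0's identity map. */
static uint64_t select_guest_cr3_old(const struct model_vcpu *v)
{
	return v->paging_enabled ? v->arch_cr3 : v->l0_identity_map;
}

/* Rule after the change: L0's identity map is only used outside guest mode. */
static uint64_t select_guest_cr3_new(const struct model_vcpu *v)
{
	if (v->paging_enabled || v->guest_mode)
		return v->arch_cr3;
	return v->l0_identity_map;
}

/* Walks through the case the thread is about: L2 in real mode. */
static void model_self_check(void)
{
	struct model_vcpu l2_real_mode = {
		.paging_enabled = false,
		.guest_mode = true,
		.arch_cr3 = 0x1000,            /* arbitrary value set up by L1 */
		.l0_identity_map = 0xfffbc000, /* arbitrary L0 address */
	};

	/* Old rule overrides L1's value with L0's identity map. */
	assert(select_guest_cr3_old(&l2_real_mode) == 0xfffbc000);
	/* New rule keeps the value L1 configured. */
	assert(select_guest_cr3_new(&l2_real_mode) == 0x1000);
}
```

Calling model_self_check() from a small main() exercises the one case where
the two rules differ. For a non-nested vCPU (guest_mode false) both helpers
return the same value, so the change should only affect L2.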