From patchwork Mon Dec 3 23:17:11 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiao Guangrong X-Patchwork-Id: 1835591 Return-Path: X-Original-To: patchwork-kvm@patchwork.kernel.org Delivered-To: patchwork-process-083081@patchwork2.kernel.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by patchwork2.kernel.org (Postfix) with ESMTP id 45CB7DF5B1 for ; Mon, 3 Dec 2012 23:17:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751758Ab2LCXRY (ORCPT ); Mon, 3 Dec 2012 18:17:24 -0500 Received: from e28smtp02.in.ibm.com ([122.248.162.2]:43992 "EHLO e28smtp02.in.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751705Ab2LCXRV (ORCPT ); Mon, 3 Dec 2012 18:17:21 -0500 Received: from /spool/local by e28smtp02.in.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 4 Dec 2012 04:47:09 +0530 Received: from d28dlp02.in.ibm.com (9.184.220.127) by e28smtp02.in.ibm.com (192.168.1.132) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Tue, 4 Dec 2012 04:47:07 +0530 Received: from d28relay04.in.ibm.com (d28relay04.in.ibm.com [9.184.220.61]) by d28dlp02.in.ibm.com (Postfix) with ESMTP id 2BF39394004B; Tue, 4 Dec 2012 04:47:15 +0530 (IST) Received: from d28av05.in.ibm.com (d28av05.in.ibm.com [9.184.220.67]) by d28relay04.in.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id qB3NHELY7012822; Tue, 4 Dec 2012 04:47:14 +0530 Received: from d28av05.in.ibm.com (loopback [127.0.0.1]) by d28av05.in.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id qB44ksgp027609; Tue, 4 Dec 2012 15:46:54 +1100 Received: from localhost.localdomain ([9.125.29.183]) by d28av05.in.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id qB44kqe8027450; Tue, 4 Dec 2012 15:46:52 +1100 Message-ID: <50BD32F7.5080601@linux.vnet.ibm.com> Date: Tue, 04 Dec 2012 07:17:11 +0800 From: Xiao Guangrong User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:15.0) Gecko/20120911 Thunderbird/15.0.1 MIME-Version: 1.0 To: Avi Kivity CC: Marcelo Tosatti , Gleb Natapov , LKML , KVM Subject: [PATCH] KVM: MMU: optimize for set_spte X-Content-Scanned: Fidelis XPS MAILER x-cbid: 12120323-5816-0000-0000-000005AA925E Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org There are two cases we need to adjust page size in set_spte: 1): the one is other vcpu creates new sp in the window between mapping_level() and acquiring mmu-lock. 2): the another case is the new sp is created by itself (page-fault path) when guest uses the target gfn as its page table. In current code, set_spte drop the spte and emulate the access for these case, it works not good: - for the case 1, it may destroy the mapping established by other vcpu, and do expensive instruction emulation. - for the case 2, it may emulate the access even if the guest is accessing the page which not used as page table. There is a example, 0~2M is used as huge page in guest, in this huge page, only page 3 used as page table, then guest read/writes on other pages can cause instruction emulation. Both of these cases can be fixed by allowing guest to retry the access, it will refault, then we can establish the mapping by using small page Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 16 ++++++++++++---- 1 files changed, 12 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index b875a9e..01d7c2a 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -2382,12 +2382,20 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep, || (!vcpu->arch.mmu.direct_map && write_fault && !is_write_protection(vcpu) && !user_fault)) { + /* + * There are two cases: + * - the one is other vcpu creates new sp in the window + * between mapping_level() and acquiring mmu-lock. + * - the another case is the new sp is created by itself + * (page-fault path) when guest uses the target gfn as + * its page table. + * Both of these cases can be fixed by allowing guest to + * retry the access, it will refault, then we can establish + * the mapping by using small page. + */ if (level > PT_PAGE_TABLE_LEVEL && - has_wrprotected_page(vcpu->kvm, gfn, level)) { - ret = 1; - drop_spte(vcpu->kvm, sptep); + has_wrprotected_page(vcpu->kvm, gfn, level)) goto done; - } spte |= PT_WRITABLE_MASK | SPTE_MMU_WRITEABLE;