From patchwork Wed Dec 16 12:28:43 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yanan Wang
X-Patchwork-Id: 11977457
From: Yanan Wang
To: Marc Zyngier, Catalin Marinas, Will Deacon, James Morse,
 "Julien Thierry", Suzuki K Poulose, Gavin Shan, Quentin Perret
Subject: [PATCH v2 2/3] KVM: arm64: Add prejudgement for relaxing permissions
 only case in stage2 translation fault handler
Date: Wed, 16 Dec 2020 20:28:43 +0800
Message-ID: <20201216122844.25092-3-wangyanan55@huawei.com>
In-Reply-To: <20201216122844.25092-1-wangyanan55@huawei.com>
References: <20201216122844.25092-1-wangyanan55@huawei.com>
Cc: yuzenghui@huawei.com, wanghaibin.wang@huawei.com, Yanan Wang,
 zhukeqian1@huawei.com, yezengruan@huawei.com

During dirty logging, after dirty logging has been stopped, or even
during normal running of a guest configured with huge mappings and a
large number of vCPUs, translation faults taken by different vCPUs on
the same GPA can occur in quick succession, almost at the same time.
There are two reasons for this.

(1) If several vCPUs access the same GPA at the same time and the leaf
PTE is not set yet, they will all take translation faults. The first
vCPU to hold mmu_lock sets a valid leaf PTE, and the others later
replace the old PTE with a new one if the two differ.

(2) When a leaf entry or a table entry is changed with
break-before-make (BBM), any vCPUs that access the same GPA at the
moment the target PTE is invalid during the BBM sequence will all take
translation faults, and will later replace the old PTE with a new one
if the two differ.

The worst case looks like this: vCPU A takes a translation fault with
RW prot and sets the leaf PTE with RW permissions, and then vCPU B,
with RO prot, updates the PTE back to RO permissions using
break-before-make. The window in which the PTE is invalid during BBM
may trigger further unnecessary translation faults, so small useless
loops can occur and leave a vCPU stuck.

To avoid the unnecessary updates and small loops, add a prejudgement to
the translation fault handler: skip updating the PTE with
break-before-make if we are trying to recreate the exact same mapping
or only change the access permissions. If a change of permissions is
actually needed, it will be handled through the relax_perms path next
time.

Signed-off-by: Yanan Wang
---
 arch/arm64/kvm/hyp/pgtable.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 350f9f810930..8225ced49bad 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -45,6 +45,10 @@

 #define KVM_PTE_LEAF_ATTR_HI_S2_XN	BIT(54)

+#define KVM_PTE_LEAF_ATTR_S2_PERMS	(KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | \
+					 KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | \
+					 KVM_PTE_LEAF_ATTR_HI_S2_XN)
+
 struct kvm_pgtable_walk_data {
 	struct kvm_pgtable		*pgt;
 	struct kvm_pgtable_walker	*walker;
@@ -460,7 +464,7 @@ static int stage2_map_set_prot_attr(enum kvm_pgtable_prot prot,
 	return 0;
 }

-static bool stage2_map_walker_try_leaf(u64 addr, u64 end, u32 level,
+static int stage2_map_walker_try_leaf(u64 addr, u64 end, u32 level,
 				       kvm_pte_t *ptep,
 				       struct stage2_map_data *data)
 {
@@ -469,13 +473,18 @@ static bool stage2_map_walker_try_leaf(u64 addr, u64 end, u32 level,
 	struct page *page = virt_to_page(ptep);

 	if (!kvm_block_mapping_supported(addr, end, phys, level))
-		return false;
+		return 1;

 	new = kvm_init_valid_leaf_pte(phys, data->attr, level);
 	if (kvm_pte_valid(old)) {
-		/* Tolerate KVM recreating the exact same mapping */
-		if (old == new)
-			goto out;
+		/*
+		 * Skip updating the PTE with break-before-make if we are trying
+		 * to recreate the exact same mapping or only change the access
+		 * permissions. Actually, change of permissions will be handled
+		 * through the relax_perms path next time if necessary.
+		 */
+		if (!((old ^ new) & (~KVM_PTE_LEAF_ATTR_S2_PERMS)))
+			return -EAGAIN;

 		/* There's an existing different valid leaf entry, so perform
 		 * break-before-make.
@@ -487,9 +496,8 @@ static bool stage2_map_walker_try_leaf(u64 addr, u64 end, u32 level,

 	smp_store_release(ptep, new);
 	get_page(page);
-out:
 	data->phys += granule;
-	return true;
+	return 0;
 }

 static int stage2_map_walk_table_pre(u64 addr, u64 end, u32 level,
@@ -517,6 +525,7 @@ static int stage2_map_walk_table_pre(u64 addr, u64 end, u32 level,
 static int stage2_map_walk_leaf(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
 				struct stage2_map_data *data)
 {
+	int ret;
 	kvm_pte_t *childp, pte = *ptep;
 	struct page *page = virt_to_page(ptep);

@@ -527,8 +536,9 @@ static int stage2_map_walk_leaf(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
 		return 0;
 	}

-	if (stage2_map_walker_try_leaf(addr, end, level, ptep, data))
-		return 0;
+	ret = stage2_map_walker_try_leaf(addr, end, level, ptep, data);
+	if (ret <= 0)
+		return ret;

 	if (WARN_ON(level == KVM_PGTABLE_MAX_LEVELS - 1))
 		return -EINVAL;
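
For readers who want to try the new check outside the kernel, the following is a minimal user-space sketch of the test `!((old ^ new) & ~KVM_PTE_LEAF_ATTR_S2_PERMS)` and of the RW-then-RO scenario from the commit message. It is not the kernel code: the helpers `pte_differs_only_in_perms()` and `try_install_leaf()` are hypothetical names invented for illustration, the valid bit at bit 0 and the S2AP read/write bits at bits 6 and 7 are assumptions of this sketch (only XN at bit 54 appears in the patch itself), and the real function also performs break-before-make, TLB invalidation and page refcounting, which are omitted here.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t kvm_pte_t;

#define BIT(n)	(UINT64_C(1) << (n))

/* Assumed for this sketch: valid bit at 0, S2AP read/write at bits 6 and 7. */
#define KVM_PTE_VALID			BIT(0)
#define KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R	BIT(6)
#define KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W	BIT(7)
/* XN at bit 54, as shown in the patch context. */
#define KVM_PTE_LEAF_ATTR_HI_S2_XN	BIT(54)

#define KVM_PTE_LEAF_ATTR_S2_PERMS	(KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | \
					 KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | \
					 KVM_PTE_LEAF_ATTR_HI_S2_XN)

/*
 * Hypothetical helper: true when the two PTEs are identical or differ
 * only in the permission bits (R, W, XN). XOR yields the differing bits;
 * masking out the permission bits leaves zero in exactly those cases.
 */
static bool pte_differs_only_in_perms(kvm_pte_t old_pte, kvm_pte_t new_pte)
{
	return !((old_pte ^ new_pte) & ~KVM_PTE_LEAF_ATTR_S2_PERMS);
}

/*
 * Hypothetical model of the leaf-install decision. Returns -EAGAIN and
 * leaves the PTE untouched when a valid entry would only have its
 * permissions changed; otherwise stores the new entry and returns 0.
 * The real code does break-before-make before replacing a valid entry.
 */
static int try_install_leaf(kvm_pte_t *ptep, kvm_pte_t new_pte)
{
	kvm_pte_t old_pte = *ptep;

	if ((old_pte & KVM_PTE_VALID) &&
	    pte_differs_only_in_perms(old_pte, new_pte))
		return -EAGAIN;

	*ptep = new_pte;
	return 0;
}
```

In this model, vCPU A installing an RW entry into an empty slot gets 0, and vCPU B arriving afterwards with an RO entry for the same output address gets -EAGAIN while the RW entry stays in place, which is the behaviour the commit message describes. An entry pointing at a different output address is still installed.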