From patchwork Sat Nov 12 08:17:03 2022
X-Patchwork-Submitter: Ricardo Koller
X-Patchwork-Id: 13041031
Date: Sat, 12 Nov 2022 08:17:03 +0000
In-Reply-To: <20221112081714.2169495-1-ricarkol@google.com>
Message-ID: <20221112081714.2169495-2-ricarkol@google.com>
Subject: [RFC PATCH 01/12] KVM: arm64: Relax WARN check in stage2_make_pte()
From: Ricardo Koller
stage2_make_pte() currently WARNs when used from a non-shared walk, because
PTEs are only "locked" when walking shared. Only perform the check for
shared walks, so that the helper can also be used from non-shared walks.

Signed-off-by: Ricardo Koller
---
 arch/arm64/kvm/hyp/pgtable.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index c12462439e70..b16107bf917c 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -733,7 +733,8 @@ static void stage2_make_pte(const struct kvm_pgtable_visit_ctx *ctx, kvm_pte_t n
 {
	struct kvm_pgtable_mm_ops *mm_ops = ctx->mm_ops;

-	WARN_ON(!stage2_pte_is_locked(*ctx->ptep));
+	if (kvm_pgtable_walk_shared(ctx))
+		WARN_ON(!stage2_pte_is_locked(*ctx->ptep));

	if (stage2_pte_is_counted(new))
		mm_ops->get_page(ctx->ptep);
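For context: kvm_pgtable_walk_shared() is a helper from the parallel stage-2
walker rework this series builds on. A sketch of what it is assumed to look
like (shown for reference only; this patch does not add it):

static inline bool kvm_pgtable_walk_shared(const struct kvm_pgtable_visit_ctx *ctx)
{
	return ctx->flags & KVM_PGTABLE_WALK_SHARED;
}

With the change above, the "is locked" WARN only applies to shared walks;
non-shared walks, which run under the MMU write lock, can use
stage2_make_pte() without locking the PTE first.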
From patchwork Sat Nov 12 08:17:04 2022
X-Patchwork-Submitter: Ricardo Koller
X-Patchwork-Id: 13041032
Date: Sat, 12 Nov 2022 08:17:04 +0000
In-Reply-To: <20221112081714.2169495-1-ricarkol@google.com>
Message-ID: <20221112081714.2169495-3-ricarkol@google.com>
Subject: [RFC PATCH 02/12] KVM: arm64: Allow visiting block PTEs in post-order
From: Ricardo Koller

The page table walker does not visit block PTEs in post-order. But there are
some cases where doing so would be beneficial; for example, breaking a 1G
block PTE into a full tree in post-order avoids visiting the new tree.

Allow post-order visits of block PTEs. This will be used in a subsequent
commit for eagerly breaking huge pages.

Signed-off-by: Ricardo Koller
---
 arch/arm64/include/asm/kvm_pgtable.h |  4 ++--
 arch/arm64/kvm/hyp/nvhe/setup.c      |  2 +-
 arch/arm64/kvm/hyp/pgtable.c         | 25 ++++++++++++-------------
 3 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index e2edeed462e8..d2e4a5032146 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -255,7 +255,7 @@ struct kvm_pgtable {
  *					entries.
  * @KVM_PGTABLE_WALK_TABLE_PRE:		Visit table entries before their
  *					children.
- * @KVM_PGTABLE_WALK_TABLE_POST:	Visit table entries after their
+ * @KVM_PGTABLE_WALK_POST:		Visit leaf or table entries after their
  *					children.
  * @KVM_PGTABLE_WALK_SHARED:		Indicates the page-tables may be shared
  *					with other software walkers.
@@ -263,7 +263,7 @@ struct kvm_pgtable {
 enum kvm_pgtable_walk_flags {
	KVM_PGTABLE_WALK_LEAF			= BIT(0),
	KVM_PGTABLE_WALK_TABLE_PRE		= BIT(1),
-	KVM_PGTABLE_WALK_TABLE_POST		= BIT(2),
+	KVM_PGTABLE_WALK_POST			= BIT(2),
	KVM_PGTABLE_WALK_SHARED			= BIT(3),
 };

diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
index b47d969ae4d3..b0c1618d053b 100644
--- a/arch/arm64/kvm/hyp/nvhe/setup.c
+++ b/arch/arm64/kvm/hyp/nvhe/setup.c
@@ -265,7 +265,7 @@ static int fix_hyp_pgtable_refcnt(void)
 {
	struct kvm_pgtable_walker walker = {
		.cb	= fix_hyp_pgtable_refcnt_walker,
-		.flags	= KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_TABLE_POST,
+		.flags	= KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_POST,
		.arg	= pkvm_pgtable.mm_ops,
	};

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index b16107bf917c..1b371f6dbac2 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -206,16 +206,15 @@ static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
	if (!table) {
		data->addr = ALIGN_DOWN(data->addr, kvm_granule_size(level));
		data->addr += kvm_granule_size(level);
-		goto out;
+	} else {
+		childp = (kvm_pteref_t)kvm_pte_follow(ctx.old, mm_ops);
+		ret = __kvm_pgtable_walk(data, mm_ops, childp, level + 1);
+		if (ret)
+			goto out;
	}

-	childp = (kvm_pteref_t)kvm_pte_follow(ctx.old, mm_ops);
-	ret = __kvm_pgtable_walk(data, mm_ops, childp, level + 1);
-	if (ret)
-		goto out;
-
-	if (ctx.flags & KVM_PGTABLE_WALK_TABLE_POST)
-		ret = kvm_pgtable_visitor_cb(data, &ctx, KVM_PGTABLE_WALK_TABLE_POST);
+	if (ctx.flags & KVM_PGTABLE_WALK_POST)
+		ret = kvm_pgtable_visitor_cb(data, &ctx, KVM_PGTABLE_WALK_POST);

 out:
	return ret;
@@ -494,7 +493,7 @@ u64 kvm_pgtable_hyp_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size)
	struct kvm_pgtable_walker walker = {
		.cb	= hyp_unmap_walker,
		.arg	= &unmapped,
-		.flags	= KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_TABLE_POST,
+		.flags	= KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_POST,
	};

	if (!pgt->mm_ops->page_count)
@@ -542,7 +541,7 @@ void kvm_pgtable_hyp_destroy(struct kvm_pgtable *pgt)
 {
	struct kvm_pgtable_walker walker = {
		.cb	= hyp_free_walker,
-		.flags	= KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_TABLE_POST,
+		.flags	= KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_POST,
	};

	WARN_ON(kvm_pgtable_walk(pgt, 0, BIT(pgt->ia_bits), &walker));
@@ -1003,7 +1002,7 @@ int kvm_pgtable_stage2_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size)
	struct kvm_pgtable_walker walker = {
		.cb	= stage2_unmap_walker,
		.arg	= pgt,
-		.flags	= KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_TABLE_POST,
+		.flags	= KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_POST,
	};

	return kvm_pgtable_walk(pgt, addr, size, &walker);
@@ -1234,7 +1233,7 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt)
	struct kvm_pgtable_walker walker = {
		.cb	= stage2_free_walker,
		.flags	= KVM_PGTABLE_WALK_LEAF |
-			  KVM_PGTABLE_WALK_TABLE_POST,
+			  KVM_PGTABLE_WALK_POST,
	};

	WARN_ON(kvm_pgtable_walk(pgt, 0, BIT(pgt->ia_bits), &walker));
@@ -1249,7 +1248,7 @@ void kvm_pgtable_stage2_free_removed(struct kvm_pgtable_mm_ops *mm_ops, void *pg
	struct kvm_pgtable_walker walker = {
		.cb	= stage2_free_walker,
		.flags	= KVM_PGTABLE_WALK_LEAF |
-			  KVM_PGTABLE_WALK_TABLE_POST,
+			  KVM_PGTABLE_WALK_POST,
	};
	struct kvm_pgtable_walk_data data = {
		.walker	= &walker,
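For illustration only (not part of the patch), a minimal sketch of a walker
that requests the renamed post-order callback; the function names below are
hypothetical:

/* Hypothetical example: with KVM_PGTABLE_WALK_POST, the callback now runs
 * after the children of table entries *and* for block/leaf entries. */
static int example_post_order_visitor(const struct kvm_pgtable_visit_ctx *ctx,
				      enum kvm_pgtable_walk_flags visit)
{
	/* visit == KVM_PGTABLE_WALK_POST here. */
	return 0;
}

static int example_post_order_walk(struct kvm_pgtable *pgt, u64 addr, u64 size)
{
	struct kvm_pgtable_walker walker = {
		.cb	= example_post_order_visitor,
		.flags	= KVM_PGTABLE_WALK_POST,
	};

	return kvm_pgtable_walk(pgt, addr, size, &walker);
}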
From patchwork Sat Nov 12 08:17:05 2022
X-Patchwork-Submitter: Ricardo Koller
X-Patchwork-Id: 13041033
Date: Sat, 12 Nov 2022 08:17:05 +0000
In-Reply-To: <20221112081714.2169495-1-ricarkol@google.com>
Message-ID: <20221112081714.2169495-4-ricarkol@google.com>
Subject: [RFC PATCH 03/12] KVM: arm64: Add stage2_create_removed()
From: Ricardo Koller
Add a new stage2 function, stage2_create_removed(), for creating removed
tables (the opposite of kvm_pgtable_stage2_free_removed()). Creating a
removed table is useful for splitting block PTEs into tables: for example, a
1G block PTE can be split into 4K PTEs by first creating a fully populated
tree and then using it to replace the 1G PTE in a single step. This will be
used in a subsequent commit for eager huge page splitting.

No functional change intended.

Signed-off-by: Ricardo Koller
---
 arch/arm64/kvm/hyp/pgtable.c | 93 ++++++++++++++++++++++++++++++++++++
 1 file changed, 93 insertions(+)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 1b371f6dbac2..d1f309128118 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -1173,6 +1173,99 @@ int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size)
	return kvm_pgtable_walk(pgt, addr, size, &walker);
 }

+struct stage2_create_removed_data {
+	void				*memcache;
+	struct kvm_pgtable_mm_ops	*mm_ops;
+	u64				phys;
+	kvm_pte_t			attr;
+};
+
+/*
+ * This flag should only be used by the create_removed walker, as it would
+ * be misinterpreted in an installed PTE.
+ */
+#define KVM_INVALID_PTE_NO_PAGE	BIT(9)
+
+/*
+ * Failure to allocate a table results in setting the respective PTE with a
+ * valid block PTE instead of a table PTE.
+ */
+static int stage2_create_removed_walker(const struct kvm_pgtable_visit_ctx *ctx,
+					enum kvm_pgtable_walk_flags visit)
+{
+	struct stage2_create_removed_data *data = ctx->arg;
+	struct kvm_pgtable_mm_ops *mm_ops = data->mm_ops;
+	u64 granule = kvm_granule_size(ctx->level);
+	kvm_pte_t attr = data->attr;
+	kvm_pte_t *childp = NULL;
+	u32 level = ctx->level;
+	int ret = 0;
+
+	if (level < KVM_PGTABLE_MAX_LEVELS - 1) {
+		childp = mm_ops->zalloc_page(data->memcache);
+		ret = childp ? 0 : -ENOMEM;
+	}
+
+	if (childp)
+		*ctx->ptep = kvm_init_table_pte(childp, mm_ops);
+
+	/*
+	 * Create a block PTE if we are at the max level, or if we failed
+	 * to create a table (we are not at max level).
+	 */
+	if (level == KVM_PGTABLE_MAX_LEVELS - 1 || !childp) {
+		*ctx->ptep = kvm_init_valid_leaf_pte(data->phys, attr, level);
+		data->phys += granule;
+	}
+
+	if (ctx->old != KVM_INVALID_PTE_NO_PAGE)
+		mm_ops->get_page(ctx->ptep);
+
+	return ret;
+}
+
+/*
+ * Create a removed page-table tree of PAGE_SIZE leaf PTEs under *ptep.
+ * This new page-table tree is not reachable (i.e., it is removed) from the
+ * root (the pgd).
+ *
+ * This function will try to create as many entries in the tree as allowed
+ * by the memcache capacity. It always writes a valid PTE into *ptep. In
+ * the best case, it returns 0 and a fully populated tree under *ptep. In
+ * the worst case, it returns -ENOMEM and *ptep will contain a valid block
+ * PTE covering the expected level, or any other valid combination (e.g., a
+ * 1G table PTE pointing to half 2M block PTEs and half 2M table PTEs).
+ */
+static int stage2_create_removed(kvm_pte_t *ptep, u64 phys, u32 level,
+				 kvm_pte_t attr, void *memcache,
+				 struct kvm_pgtable_mm_ops *mm_ops)
+{
+	struct stage2_create_removed_data alloc_data = {
+		.phys		= phys,
+		.memcache	= memcache,
+		.mm_ops		= mm_ops,
+		.attr		= attr,
+	};
+	struct kvm_pgtable_walker walker = {
+		.cb		= stage2_create_removed_walker,
+		.flags		= KVM_PGTABLE_WALK_LEAF,
+		.arg		= &alloc_data,
+	};
+	struct kvm_pgtable_walk_data data = {
+		.walker	= &walker,
+
+		/* The IPA is irrelevant for a removed table. */
+		.addr	= 0,
+		.end	= kvm_granule_size(level),
+	};
+
+	/*
+	 * The walker should not try to get a reference to the memory
+	 * holding this ptep (it's not a page).
+	 */
+	*ptep = KVM_INVALID_PTE_NO_PAGE;
+	return __kvm_pgtable_visit(&data, mm_ops, ptep, level);
+}

 int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2_mmu *mmu,
			      struct kvm_pgtable_mm_ops *mm_ops,
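To make the intended call pattern concrete, a condensed, hypothetical sketch
of how a caller could use stage2_create_removed() to replace a block PTE (the
real caller is the eager-split walker added later in this series):

/* Hypothetical sketch, not part of this patch. */
static int example_replace_block(const struct kvm_pgtable_visit_ctx *ctx,
				 kvm_pte_t attr, void *memcache)
{
	u64 phys = kvm_pte_to_phys(ctx->old);
	kvm_pte_t new;
	int ret;

	/* Best-effort: on -ENOMEM, 'new' still holds a valid block PTE. */
	ret = stage2_create_removed(&new, phys, ctx->level, attr, memcache,
				    ctx->mm_ops);

	/* Installing 'new' over the old block PTE is done by the caller,
	 * e.g. with stage2_make_pte() (see the next patch). */
	return ret;
}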
From patchwork Sat Nov 12 08:17:06 2022
X-Patchwork-Submitter: Ricardo Koller
X-Patchwork-Id: 13041034
Date: Sat, 12 Nov 2022 08:17:06 +0000
In-Reply-To: <20221112081714.2169495-1-ricarkol@google.com>
Message-ID: <20221112081714.2169495-5-ricarkol@google.com>
Subject: [RFC PATCH 04/12] KVM: arm64: Add kvm_pgtable_stage2_split()
From: Ricardo Koller

Add a new stage2 function, kvm_pgtable_stage2_split(), for splitting a range
of huge pages. This will be used for eager-splitting huge pages into
PAGE_SIZE pages. The goal is to avoid having to split huge pages on
write-protection faults, and instead use this function to do it ahead of
time for large ranges (e.g., all guest memory in 1G chunks at a time).

No functional change intended. This new function will be used in a
subsequent commit.

Signed-off-by: Ricardo Koller
---
 arch/arm64/include/asm/kvm_pgtable.h | 29 +++++++++++
 arch/arm64/kvm/hyp/pgtable.c         | 74 ++++++++++++++++++++++++++++
 2 files changed, 103 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index d2e4a5032146..396ebb0949fb 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -594,6 +594,35 @@ bool kvm_pgtable_stage2_is_young(struct kvm_pgtable *pgt, u64 addr);
  */
 int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size);

+/**
+ * kvm_pgtable_stage2_split() - Split a range of huge pages into leaf PTEs
+ *				pointing to PAGE_SIZE guest pages.
+ * @pgt:	Page-table structure initialised by kvm_pgtable_stage2_init*().
+ * @addr:	Intermediate physical address from which to split.
+ * @size:	Size of the range.
+ * @mc:		Cache of pre-allocated and zeroed memory from which to allocate
+ *		page-table pages.
+ *
+ * @addr and the end (@addr + @size) are effectively aligned down and up to
+ * the top level huge-page block size. This is an example using 1GB
+ * huge-pages and 4KB granules.
+ *
+ *                          [---input range---]
+ *                          :                 :
+ * [--1G block pte--][--1G block pte--][--1G block pte--][--1G block pte--]
+ *                          :                 :
+ *                   [--2MB--][--2MB--][--2MB--][--2MB--]
+ *                          :                 :
+ *                   [ ][ ][:][ ][ ][ ][ ][ ][:][ ][ ][ ]
+ *                          :                 :
+ *
+ * Return: 0 on success, negative error code on failure. Note that
+ * kvm_pgtable_stage2_split() is best effort: it tries to break as many
+ * blocks in the input range as allowed by the size of the memcache. It
+ * will fail if it wasn't able to break any block.
+ */
+int kvm_pgtable_stage2_split(struct kvm_pgtable *pgt, u64 addr, u64 size, void *mc);
+
 /**
  * kvm_pgtable_walk() - Walk a page-table.
  * @pgt:	Page-table structure initialised by kvm_pgtable_*_init().
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index d1f309128118..9c42eff6d42e 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -1267,6 +1267,80 @@ static int stage2_create_removed(kvm_pte_t *ptep, u64 phys, u32 level,
	return __kvm_pgtable_visit(&data, mm_ops, ptep, level);
 }

+struct stage2_split_data {
+	struct kvm_s2_mmu		*mmu;
+	void				*memcache;
+	struct kvm_pgtable_mm_ops	*mm_ops;
+};
+
+static int stage2_split_walker(const struct kvm_pgtable_visit_ctx *ctx,
+			       enum kvm_pgtable_walk_flags visit)
+{
+	struct stage2_split_data *data = ctx->arg;
+	struct kvm_pgtable_mm_ops *mm_ops = data->mm_ops;
+	kvm_pte_t pte = ctx->old, attr, new;
+	enum kvm_pgtable_prot prot;
+	void *mc = data->memcache;
+	u32 level = ctx->level;
+	u64 phys;
+
+	if (WARN_ON_ONCE(kvm_pgtable_walk_shared(ctx)))
+		return -EINVAL;
+
+	/* Nothing to split at the last level */
+	if (level == KVM_PGTABLE_MAX_LEVELS - 1)
+		return 0;
+
+	/* We only split valid block mappings */
+	if (!kvm_pte_valid(pte) || kvm_pte_table(pte, ctx->level))
+		return 0;
+
+	phys = kvm_pte_to_phys(pte);
+	prot = kvm_pgtable_stage2_pte_prot(pte);
+	stage2_set_prot_attr(data->mmu->pgt, prot, &attr);
+
+	/*
+	 * Eager page splitting is best-effort, so we can ignore the error.
+	 * The returned PTE (new) will be valid even if this call returns
+	 * an error: new will be a single (big) block PTE. The only issue is
+	 * that it will affect dirty logging performance, as the huge-pages
+	 * will have to be split on fault, and so we WARN.
+	 */
+	WARN_ON(stage2_create_removed(&new, phys, level, attr, mc, mm_ops));
+
+	stage2_put_pte(ctx, data->mmu, mm_ops);
+
+	/*
+	 * Note, the contents of the page table are guaranteed to be made
+	 * visible before the new PTE is assigned because
+	 * stage2_make_pte() writes the PTE using smp_store_release().
+	 */
+	stage2_make_pte(ctx, new);
+	dsb(ishst);
+	return 0;
+}
+
+int kvm_pgtable_stage2_split(struct kvm_pgtable *pgt,
+			     u64 addr, u64 size, void *mc)
+{
+	int ret;
+
+	struct stage2_split_data split_data = {
+		.mmu		= pgt->mmu,
+		.memcache	= mc,
+		.mm_ops		= pgt->mm_ops,
+	};
+
+	struct kvm_pgtable_walker walker = {
+		.cb	= stage2_split_walker,
+		.flags	= KVM_PGTABLE_WALK_POST,
+		.arg	= &split_data,
+	};
+
+	ret = kvm_pgtable_walk(pgt, addr, size, &walker);
+	return ret;
+}
+
 int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2_mmu *mmu,
			      struct kvm_pgtable_mm_ops *mm_ops,
			      enum kvm_pgtable_stage2_flags flags,
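As a usage illustration (hypothetical; the real caller is added by the
dirty-logging patch later in this series), splitting a range under the MMU
write lock with a pre-topped-up memcache might look like:

/* Hypothetical sketch: 'cache' is assumed to have been topped up already. */
static int example_eager_split(struct kvm *kvm, phys_addr_t addr,
			       phys_addr_t size,
			       struct kvm_mmu_memory_cache *cache)
{
	struct kvm_pgtable *pgt = kvm->arch.mmu.pgt;
	int ret;

	write_lock(&kvm->mmu_lock);
	ret = kvm_pgtable_stage2_split(pgt, addr, size, cache);
	write_unlock(&kvm->mmu_lock);

	return ret;
}

Because the split is best-effort, a non-zero return only means some blocks
were left unsplit; those will still be split lazily on write-protection
faults.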
From patchwork Sat Nov 12 08:17:07 2022
X-Patchwork-Submitter: Ricardo Koller
X-Patchwork-Id: 13041035
Date: Sat, 12 Nov 2022 08:17:07 +0000
In-Reply-To: <20221112081714.2169495-1-ricarkol@google.com>
Message-ID: <20221112081714.2169495-6-ricarkol@google.com>
Subject: [RFC PATCH 05/12] arm64: Add a capability for FEAT_BBM level 2
From: Ricardo Koller

Add a new capability to detect "Stage-2 Translation table break-before-make"
(FEAT_BBM) level 2.

Signed-off-by: Ricardo Koller
---
 arch/arm64/kernel/cpufeature.c | 11 +++++++++++
 arch/arm64/tools/cpucaps       |  1 +
 2 files changed, 12 insertions(+)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 6062454a9067..ff97fb05d430 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2339,6 +2339,17 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
		.min_field_value = 1,
		.matches = has_cpuid_feature,
	},
+	{
+		.desc = "Stage-2 Translation table break-before-make level 2",
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.capability = ARM64_HAS_STAGE2_BBM2,
+		.sys_reg = SYS_ID_AA64MMFR2_EL1,
+		.sign = FTR_UNSIGNED,
+		.field_pos = ID_AA64MMFR2_EL1_BBM_SHIFT,
+		.field_width = 4,
+		.min_field_value = 2,
+		.matches = has_cpuid_feature,
+	},
	{
		.desc = "TLB range maintenance instructions",
		.capability = ARM64_HAS_TLB_RANGE,
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index f1c0347ec31a..f421adbdb08b 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -36,6 +36,7 @@ HAS_PAN
 HAS_RAS_EXTN
 HAS_RNG
 HAS_SB
+HAS_STAGE2_BBM2
 HAS_STAGE2_FWB
 HAS_SYSREG_GIC_CPUIF
 HAS_TIDCP1
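For illustration, the new capability is queried like any other cpucap; the
helper below is exactly what the next patch in the series adds:

static bool stage2_has_bbm_level2(void)
{
	return cpus_have_const_cap(ARM64_HAS_STAGE2_BBM2);
}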
From patchwork Sat Nov 12 08:17:08 2022
X-Patchwork-Submitter: Ricardo Koller
X-Patchwork-Id: 13041036
Date: Sat, 12 Nov 2022 08:17:08 +0000
In-Reply-To: <20221112081714.2169495-1-ricarkol@google.com>
Message-ID: <20221112081714.2169495-7-ricarkol@google.com>
Subject: [RFC PATCH 06/12] KVM: arm64: Split block PTEs without using break-before-make
From: Ricardo Koller

Breaking a huge-page block PTE into an equivalent table of smaller PTEs does
not require using break-before-make (BBM) when FEAT_BBM level 2 is
implemented. Add the respective check for eager page splitting and avoid
using BBM when it is not needed.

Also take care of possible Conflict aborts: according to the rules in the
Arm ARM (DDI 0487H.a) section D5.10.1, "Support levels for changing block
size", changing the block size without BBM can result in a Conflict abort.
Handle it by invalidating all of the VM's TLB entries.
Signed-off-by: Ricardo Koller
---
 arch/arm64/include/asm/esr.h     |  1 +
 arch/arm64/include/asm/kvm_arm.h |  1 +
 arch/arm64/kvm/hyp/pgtable.c     | 10 +++++++++-
 arch/arm64/kvm/mmu.c             |  6 ++++++
 4 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index 15b34fbfca66..6f5b976396e7 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -114,6 +114,7 @@
 #define ESR_ELx_FSC_ACCESS	(0x08)
 #define ESR_ELx_FSC_FAULT	(0x04)
 #define ESR_ELx_FSC_PERM	(0x0C)
+#define ESR_ELx_FSC_CONFLICT	(0x30)

 /* ISS field definitions for Data Aborts */
 #define ESR_ELx_ISV_SHIFT	(24)
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 0df3fc3a0173..58e7cbe3c250 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -333,6 +333,7 @@
 #define FSC_SECC_TTW1	(0x1d)
 #define FSC_SECC_TTW2	(0x1e)
 #define FSC_SECC_TTW3	(0x1f)
+#define FSC_CONFLICT	ESR_ELx_FSC_CONFLICT

 /* Hyp Prefetch Fault Address Register (HPFAR/HDFAR) */
 #define HPFAR_MASK	(~UL(0xf))
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 9c42eff6d42e..36b81df5687e 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -1267,6 +1267,11 @@ static int stage2_create_removed(kvm_pte_t *ptep, u64 phys, u32 level,
	return __kvm_pgtable_visit(&data, mm_ops, ptep, level);
 }

+static bool stage2_has_bbm_level2(void)
+{
+	return cpus_have_const_cap(ARM64_HAS_STAGE2_BBM2);
+}
+
 struct stage2_split_data {
	struct kvm_s2_mmu		*mmu;
	void				*memcache;
@@ -1308,7 +1313,10 @@ static int stage2_split_walker(const struct kvm_pgtable_visit_ctx *ctx,
	 */
	WARN_ON(stage2_create_removed(&new, phys, level, attr, mc, mm_ops));

-	stage2_put_pte(ctx, data->mmu, mm_ops);
+	if (stage2_has_bbm_level2())
+		mm_ops->put_page(ctx->ptep);
+	else
+		stage2_put_pte(ctx, data->mmu, mm_ops);

	/*
	 * Note, the contents of the page table are guaranteed to be made
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 8f26c65693a9..318f7b0aa20b 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1481,6 +1481,12 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
		return 1;
	}
+	/* Conflict abort? */
+	if (fault_status == FSC_CONFLICT) {
+		kvm_flush_remote_tlbs(vcpu->kvm);
+		return 1;
+	}
+
	trace_kvm_guest_fault(*vcpu_pc(vcpu), kvm_vcpu_get_esr(vcpu),
			      kvm_vcpu_get_hfar(vcpu), fault_ipa);
From patchwork Sat Nov 12 08:17:09 2022
X-Patchwork-Submitter: Ricardo Koller
X-Patchwork-Id: 13041037
Date: Sat, 12 Nov 2022 08:17:09 +0000
In-Reply-To: <20221112081714.2169495-1-ricarkol@google.com>
Message-ID: <20221112081714.2169495-8-ricarkol@google.com>
Subject: [RFC PATCH 07/12] KVM: arm64: Refactor kvm_arch_commit_memory_region()
From: Ricardo Koller

Refactor kvm_arch_commit_memory_region() in preparation for a future commit,
making it cleaner and easier to follow. It now also looks more like its x86
counterpart, kvm_mmu_slot_apply_flags().

No functional change intended.

Signed-off-by: Ricardo Koller
---
 arch/arm64/kvm/mmu.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 318f7b0aa20b..6599a45eebf5 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1770,20 +1770,27 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
				   const struct kvm_memory_slot *new,
				   enum kvm_mr_change change)
 {
+	bool log_dirty_pages = new && new->flags & KVM_MEM_LOG_DIRTY_PAGES;
+
	/*
	 * At this point memslot has been committed and there is an
	 * allocated dirty_bitmap[], dirty pages will be tracked while the
	 * memory slot is write protected.
	 */
-	if (change != KVM_MR_DELETE && new->flags & KVM_MEM_LOG_DIRTY_PAGES) {
+	if (log_dirty_pages) {
+
+		if (change == KVM_MR_DELETE)
+			return;
+
		/*
		 * If we're with initial-all-set, we don't need to write
		 * protect any pages because they're all reported as dirty.
		 * Huge pages and normal pages will be write protect gradually.
		 */
-		if (!kvm_dirty_log_manual_protect_and_init_set(kvm)) {
-			kvm_mmu_wp_memory_region(kvm, new->id);
-		}
+		if (kvm_dirty_log_manual_protect_and_init_set(kvm))
+			return;
+
+		kvm_mmu_wp_memory_region(kvm, new->id);
	}
 }
From patchwork Sat Nov 12 08:17:10 2022
X-Patchwork-Submitter: Ricardo Koller
X-Patchwork-Id: 13041038
Date: Sat, 12 Nov 2022 08:17:10 +0000
In-Reply-To: <20221112081714.2169495-1-ricarkol@google.com>
Message-ID: <20221112081714.2169495-9-ricarkol@google.com>
Subject: [RFC PATCH 08/12] KVM: arm64: Add kvm_uninit_stage2_mmu()
From: Ricardo Koller

Add kvm_uninit_stage2_mmu() and move kvm_free_stage2_pgd() into it. A future
commit will add more teardown work to kvm_uninit_stage2_mmu().

No functional change intended.
Signed-off-by: Ricardo Koller
---
 arch/arm64/include/asm/kvm_mmu.h | 1 +
 arch/arm64/kvm/mmu.c             | 7 ++++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index e4a7e6369499..058f3ae5bc26 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -167,6 +167,7 @@ void free_hyp_pgds(void);
 void stage2_unmap_vm(struct kvm *kvm);
 int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long type);
+void kvm_uninit_stage2_mmu(struct kvm *kvm);
 void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu);
 int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
			  phys_addr_t pa, unsigned long size, bool writable);
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 6599a45eebf5..94865c5ce181 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -766,6 +766,11 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t
	return err;
 }

+void kvm_uninit_stage2_mmu(struct kvm *kvm)
+{
+	kvm_free_stage2_pgd(&kvm->arch.mmu);
+}
+
 static void stage2_unmap_memslot(struct kvm *kvm,
				 struct kvm_memory_slot *memslot)
 {
@@ -1869,7 +1874,7 @@ void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen)

 void kvm_arch_flush_shadow_all(struct kvm *kvm)
 {
-	kvm_free_stage2_pgd(&kvm->arch.mmu);
+	kvm_uninit_stage2_mmu(kvm);
 }

 void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
From patchwork Sat Nov 12 08:17:11 2022
X-Patchwork-Submitter: Ricardo Koller
X-Patchwork-Id: 13041039
Date: Sat, 12 Nov 2022 08:17:11 +0000
In-Reply-To: <20221112081714.2169495-1-ricarkol@google.com>
Message-ID: <20221112081714.2169495-10-ricarkol@google.com>
Subject: [RFC PATCH 09/12] KVM: arm64: Split huge pages when dirty logging is enabled
From: Ricardo Koller

Split huge pages eagerly when enabling dirty logging. The goal is to avoid
doing it while faulting on write-protected pages, which negatively impacts
guest performance.

A memslot marked for dirty logging is split in 1GB pieces at a time. This is
in order to release the mmu_lock and give other kernel threads the
opportunity to run, and also in order to allocate enough pages to split a
1GB range worth of huge pages (or a single 1GB huge page). Note that these
page allocations can fail, so eager page splitting is best-effort. This is
not a correctness issue though, as huge pages can still be split on
write-faults.

The benefits of eager page splitting are the same as in x86, added with
commit a3fe5dbda0a4 ("KVM: x86/mmu: Split huge pages mapped by the TDP MMU
when dirty logging is enabled"). For example, when running
dirty_log_perf_test with 64 virtual CPUs (Ampere Altra), 1GB per vCPU, 50%
reads, and 2MB HugeTLB memory, the time it takes vCPUs to access all of
their memory after dirty logging is enabled decreased by 44%, from 2.58s to
1.42s.
Signed-off-by: Ricardo Koller
---
 arch/arm64/include/asm/kvm_host.h |  30 ++++++++
 arch/arm64/kvm/mmu.c              | 110 +++++++++++++++++++++++++++++-
 2 files changed, 138 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 63307e7dc9c5..d43f133518cf 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -153,6 +153,36 @@ struct kvm_s2_mmu {
	/* The last vcpu id that ran on each physical CPU */
	int __percpu *last_vcpu_ran;

+	/*
+	 * Memory cache used to split EAGER_PAGE_SPLIT_CHUNK_SIZE worth of huge
+	 * pages. It is used to allocate stage2 page tables while splitting
+	 * huge pages. Its capacity should be EAGER_PAGE_SPLIT_CACHE_CAPACITY.
+	 * Note that the choice of EAGER_PAGE_SPLIT_CHUNK_SIZE influences both
+	 * the capacity of the split page cache (CACHE_CAPACITY), and how often
+	 * KVM reschedules. Be wary of raising CHUNK_SIZE too high.
+	 *
+	 * A good heuristic to pick CHUNK_SIZE is that it should be larger than
+	 * all the available huge-page sizes, and be a multiple of all the
+	 * other ones; for example, 1GB when all the available huge-page sizes
+	 * are (1GB, 2MB, 32MB, 512MB).
+	 *
+	 * CACHE_CAPACITY should have enough pages to cover CHUNK_SIZE; for
+	 * example, 1GB requires the following number of PAGE_SIZE-pages:
+	 * - 512 when using 2MB hugepages with 4KB granules (1GB / 2MB).
+	 * - 513 when using 1GB hugepages with 4KB granules (1 + (1GB / 2MB)).
+	 * - 32 when using 32MB hugepages with 16KB granule (1GB / 32MB).
+	 * - 2 when using 512MB hugepages with 64KB granules (1GB / 512MB).
+	 * CACHE_CAPACITY below assumes the worst case: 1GB hugepages with 4KB
+	 * granules.
+	 *
+	 * Protected by kvm->slots_lock.
+	 */
+#define EAGER_PAGE_SPLIT_CHUNK_SIZE	SZ_1G
+#define EAGER_PAGE_SPLIT_CACHE_CAPACITY					\
+	(DIV_ROUND_UP_ULL(EAGER_PAGE_SPLIT_CHUNK_SIZE, SZ_1G) +	\
+	 DIV_ROUND_UP_ULL(EAGER_PAGE_SPLIT_CHUNK_SIZE, SZ_2M))
+	struct kvm_mmu_memory_cache split_page_cache;
+
	struct kvm_arch *arch;
 };

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 94865c5ce181..f2753d9deb19 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -31,14 +31,24 @@ static phys_addr_t hyp_idmap_vector;

 static unsigned long io_map_base;

-static phys_addr_t stage2_range_addr_end(phys_addr_t addr, phys_addr_t end)
+bool __read_mostly eager_page_split = true;
+module_param(eager_page_split, bool, 0644);
+
+static phys_addr_t __stage2_range_addr_end(phys_addr_t addr, phys_addr_t end,
+					   phys_addr_t size)
 {
-	phys_addr_t size = kvm_granule_size(KVM_PGTABLE_MIN_BLOCK_LEVEL);
	phys_addr_t boundary = ALIGN_DOWN(addr + size, size);

	return (boundary - 1 < end - 1) ? boundary : end;
 }

+static phys_addr_t stage2_range_addr_end(phys_addr_t addr, phys_addr_t end)
+{
+	phys_addr_t size = kvm_granule_size(KVM_PGTABLE_MIN_BLOCK_LEVEL);
+
+	return __stage2_range_addr_end(addr, end, size);
+}
+
 /*
  * Release kvm_mmu_lock periodically if the memory region is large. Otherwise,
  * we may see kernel panics with CONFIG_DETECT_HUNG_TASK,
@@ -71,6 +81,64 @@ static int stage2_apply_range(struct kvm *kvm, phys_addr_t addr,
	return ret;
 }

+static inline bool need_topup(struct kvm_mmu_memory_cache *cache, int min)
+{
+	return kvm_mmu_memory_cache_nr_free_objects(cache) < min;
+}
+
+static bool need_topup_split_page_cache_or_resched(struct kvm *kvm)
+{
+	struct kvm_mmu_memory_cache *cache;
+
+	if (need_resched() || rwlock_needbreak(&kvm->mmu_lock))
+		return true;
+
+	cache = &kvm->arch.mmu.split_page_cache;
+	return need_topup(cache, EAGER_PAGE_SPLIT_CACHE_CAPACITY);
+}
+
+static int kvm_mmu_split_huge_pages(struct kvm *kvm, phys_addr_t addr,
+				    phys_addr_t end)
+{
+	struct kvm_mmu_memory_cache *cache;
+	struct kvm_pgtable *pgt;
+	int ret;
+	u64 next;
+	int cache_capacity = EAGER_PAGE_SPLIT_CACHE_CAPACITY;
+
+	lockdep_assert_held_write(&kvm->mmu_lock);
+	lockdep_assert_held(&kvm->slots_lock);
+
+	cache = &kvm->arch.mmu.split_page_cache;
+
+	do {
+		if (need_topup_split_page_cache_or_resched(kvm)) {
+			write_unlock(&kvm->mmu_lock);
+			cond_resched();
+			/* Eager page splitting is best-effort. */
*/ + ret = __kvm_mmu_topup_memory_cache(cache, + cache_capacity, + cache_capacity); + write_lock(&kvm->mmu_lock); + if (ret) + break; + } + + pgt = kvm->arch.mmu.pgt; + if (!pgt) + return -EINVAL; + + next = __stage2_range_addr_end(addr, end, + EAGER_PAGE_SPLIT_CHUNK_SIZE); + ret = kvm_pgtable_stage2_split(pgt, addr, next - addr, cache); + if (ret) + break; + } while (addr = next, addr != end); + + return ret; +} + #define stage2_apply_range_resched(kvm, addr, end, fn) \ stage2_apply_range(kvm, addr, end, fn, true) @@ -755,6 +823,8 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t for_each_possible_cpu(cpu) *per_cpu_ptr(mmu->last_vcpu_ran, cpu) = -1; + mmu->split_page_cache.gfp_zero = __GFP_ZERO; + mmu->pgt = pgt; mmu->pgd_phys = __pa(pgt->pgd); return 0; @@ -769,6 +839,7 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t void kvm_uninit_stage2_mmu(struct kvm *kvm) { kvm_free_stage2_pgd(&kvm->arch.mmu); + kvm_mmu_free_memory_cache(&kvm->arch.mmu.split_page_cache); } static void stage2_unmap_memslot(struct kvm *kvm, @@ -996,6 +1067,29 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm *kvm, stage2_wp_range(&kvm->arch.mmu, start, end); } +/** + * kvm_mmu_split_memory_region() - split the stage 2 blocks into PAGE_SIZE + * pages for memory slot + * @kvm: The KVM pointer + * @slot: The memory slot to split + * + * Acquires kvm->mmu_lock. Called with kvm->slots_lock mutex acquired, + * serializing operations for VM memory regions. + */ +static void kvm_mmu_split_memory_region(struct kvm *kvm, int slot) +{ + struct kvm_memslots *slots = kvm_memslots(kvm); + struct kvm_memory_slot *memslot = id_to_memslot(slots, slot); + phys_addr_t start, end; + + start = memslot->base_gfn << PAGE_SHIFT; + end = (memslot->base_gfn + memslot->npages) << PAGE_SHIFT; + + write_lock(&kvm->mmu_lock); + kvm_mmu_split_huge_pages(kvm, start, end); + write_unlock(&kvm->mmu_lock); +} + /* * kvm_arch_mmu_enable_log_dirty_pt_masked - enable dirty logging for selected * dirty pages. @@ -1795,7 +1889,19 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, if (kvm_dirty_log_manual_protect_and_init_set(kvm)) return; + if (READ_ONCE(eager_page_split)) + kvm_mmu_split_memory_region(kvm, new->id); + kvm_mmu_wp_memory_region(kvm, new->id); + } else { + /* + * Free any leftovers from the eager page splitting cache. Do + * this when deleting, moving, disabling dirty logging, or + * creating the memslot (a nop). Doing it for deletes makes + * sure we don't leak memory, and there's no need to keep the + * cache around for any of the other cases. 
+ */ + kvm_mmu_free_memory_cache(&kvm->arch.mmu.split_page_cache); } } From patchwork Sat Nov 12 08:17:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ricardo Koller X-Patchwork-Id: 13041040 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 30E06C433FE for ; Sat, 12 Nov 2022 08:17:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234827AbiKLIRv (ORCPT ); Sat, 12 Nov 2022 03:17:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40146 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234846AbiKLIRh (ORCPT ); Sat, 12 Nov 2022 03:17:37 -0500 Received: from mail-pj1-x1049.google.com (mail-pj1-x1049.google.com [IPv6:2607:f8b0:4864:20::1049]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2DC7E5CD0A for ; Sat, 12 Nov 2022 00:17:35 -0800 (PST) Received: by mail-pj1-x1049.google.com with SMTP id 36-20020a17090a0fa700b00213d5296e13so3696288pjz.6 for ; Sat, 12 Nov 2022 00:17:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=0zGg5qpt/N/L+7x9jBKqpcMeXFwv5fWMYVI5N+PjKgU=; b=F/QDe6+1BvTXnyFLOg8gmQXYeT1rQN+sOHZw5/jn/OIBg/ksieVuSU4RzXZqbOs6hn aNj63YtnreOXtjSpmZLavBBDWoKmI03HxKlfoMz3hB9W602bO1TYehJR+fIsGRF2s/zZ kx9eXb8+bZHkcyoh3xmjRsCjZJC8lYCvIdEXGhaFPLUysIDdId6lqbqIa6Zgm85IFCSZ rApV/opjPoYkVYAQjpoVccRv/EF5McPBadihotqfKUFYIuStVJxwmOk8b2kA3I/gnIYT O1Y+ZrnCEx6xISo+PAodbia8LsljVtUpxMEZUQLiK+M7/xgL1P5nctTNvIAmgDM2yXjf Ttdw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=0zGg5qpt/N/L+7x9jBKqpcMeXFwv5fWMYVI5N+PjKgU=; b=waj+sdwJ4hYuT0bPdYVRa1rgM6e8uaPfeQFfMMyQI7I7lhEdM+va50iTcrQ/kX6ijw W/8ko9C9vGQBIG7HABDOx6uX7YLt5oJgpNJBibK1faza3+sW+TO4lViCGpqX6s0a9GQE X2dAqUihG5BdYhM6+TyvjIWAZXJoU7IOqLb5dRfIYv81WunA/zeBLVHTlkNoluI0Zl4C SCtLHV8dd0l4ROTTPk1gCLMewoChk8ts5a88jLGAnv1dW2/4ehYFMgO2UEeDCBl5zDd/ svooGNWsGb0aI0mpsSVJvvxgIlnkY4iHdmQxAmfifyX9dEbQBEhU+qRzZbe9OmRqCDHw yPfA== X-Gm-Message-State: ANoB5pkGOmm1EoW2DjUFaoL4sow/c3LQfaiHTKOEmEAPWB8CjhDrZTaL psRYlD4DEads44UrknX1skI06PCA3ERQCg== X-Google-Smtp-Source: AA0mqf4lNmUCe4aprIz7lgD/9R+sooPtJqmagrCa4wPvpU7RVYrs0Whp+W9dnK4MbP2UeTGIJf0U7NvtOtUO4g== X-Received: from ricarkol4.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:1248]) (user=ricarkol job=sendgmr) by 2002:aa7:8dcd:0:b0:56c:674a:16f0 with SMTP id j13-20020aa78dcd000000b0056c674a16f0mr6285637pfr.10.1668241054624; Sat, 12 Nov 2022 00:17:34 -0800 (PST) Date: Sat, 12 Nov 2022 08:17:12 +0000 In-Reply-To: <20221112081714.2169495-1-ricarkol@google.com> Mime-Version: 1.0 References: <20221112081714.2169495-1-ricarkol@google.com> X-Mailer: git-send-email 2.38.1.431.g37b22c650d-goog Message-ID: <20221112081714.2169495-11-ricarkol@google.com> Subject: [RFC PATCH 10/12] KVM: arm64: Open-code kvm_mmu_write_protect_pt_masked() From: Ricardo Koller To: pbonzini@redhat.com, maz@kernel.org, oupton@google.com, dmatlack@google.com, qperret@google.com, catalin.marinas@arm.com, andrew.jones@linux.dev, seanjc@google.com, alexandru.elisei@arm.com, 
suzuki.poulose@arm.com, eric.auger@redhat.com, gshan@redhat.com, reijiw@google.com, rananta@google.com, bgardon@google.com Cc: kvm@vger.kernel.org, kvmarm@lists.linux.dev, kvmarm@lists.cs.columbia.edu, ricarkol@gmail.com, Ricardo Koller Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Move the functionality of kvm_mmu_write_protect_pt_masked() into its caller, kvm_arch_mmu_enable_log_dirty_pt_masked(). This will be used in a subsequent commit in order to share some of the code in kvm_arch_mmu_enable_log_dirty_pt_masked(). No functional change intended. Signed-off-by: Ricardo Koller --- arch/arm64/kvm/mmu.c | 42 +++++++++++++++--------------------------- 1 file changed, 15 insertions(+), 27 deletions(-) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index f2753d9deb19..7881df411643 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1045,28 +1045,6 @@ static void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot) kvm_flush_remote_tlbs(kvm); } -/** - * kvm_mmu_write_protect_pt_masked() - write protect dirty pages - * @kvm: The KVM pointer - * @slot: The memory slot associated with mask - * @gfn_offset: The gfn offset in memory slot - * @mask: The mask of dirty pages at offset 'gfn_offset' in this memory - * slot to be write protected - * - * Walks bits set in mask write protects the associated pte's. Caller must - * acquire kvm_mmu_lock. - */ -static void kvm_mmu_write_protect_pt_masked(struct kvm *kvm, - struct kvm_memory_slot *slot, - gfn_t gfn_offset, unsigned long mask) -{ - phys_addr_t base_gfn = slot->base_gfn + gfn_offset; - phys_addr_t start = (base_gfn + __ffs(mask)) << PAGE_SHIFT; - phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT; - - stage2_wp_range(&kvm->arch.mmu, start, end); -} - /** * kvm_mmu_split_memory_region() - split the stage 2 blocks into PAGE_SIZE * pages for memory slot @@ -1091,17 +1069,27 @@ static void kvm_mmu_split_memory_region(struct kvm *kvm, int slot) } /* - * kvm_arch_mmu_enable_log_dirty_pt_masked - enable dirty logging for selected - * dirty pages. + * kvm_arch_mmu_enable_log_dirty_pt_masked() - enable dirty logging for selected pages. + * @kvm: The KVM pointer + * @slot: The memory slot associated with mask + * @gfn_offset: The gfn offset in memory slot + * @mask: The mask of pages at offset 'gfn_offset' in this memory + * slot to enable dirty logging on * - * It calls kvm_mmu_write_protect_pt_masked to write protect selected pages to - * enable dirty logging for them. + * Writes protect selected pages to enable dirty logging for them. Caller must + * acquire kvm->mmu_lock. 
*/ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm, struct kvm_memory_slot *slot, gfn_t gfn_offset, unsigned long mask) { - kvm_mmu_write_protect_pt_masked(kvm, slot, gfn_offset, mask); + phys_addr_t base_gfn = slot->base_gfn + gfn_offset; + phys_addr_t start = (base_gfn + __ffs(mask)) << PAGE_SHIFT; + phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT; + + lockdep_assert_held_write(&kvm->mmu_lock); + + stage2_wp_range(&kvm->arch.mmu, start, end); } static void kvm_send_hwpoison_signal(unsigned long address, short lsb) From patchwork Sat Nov 12 08:17:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ricardo Koller X-Patchwork-Id: 13041041 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AEF08C4332F for ; Sat, 12 Nov 2022 08:18:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234810AbiKLISE (ORCPT ); Sat, 12 Nov 2022 03:18:04 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40366 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234876AbiKLIRt (ORCPT ); Sat, 12 Nov 2022 03:17:49 -0500 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BF5CA6036A for ; Sat, 12 Nov 2022 00:17:37 -0800 (PST) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-348608c1cd3so64045707b3.10 for ; Sat, 12 Nov 2022 00:17:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=3ei5KTNcwhvdEV4DEpNFWL2FLBtddGkRWDZ/PZs/pyI=; b=a19I5M6QD8ehxktbvDkxtPw0UrpG81hnIry+JL2VyC+pmJlCLjUTQU9D3tS4gRNcCU qdp/5z/jxiSnVMaqHsIyg8amm+Nl2PQ+GVBxqPMbV7Y+4Qel7yqwPcuWWOFzFhQZIN0R GYroxUY6ffo0v/rEOhQml+75jUUl6lAOZiTa2QnBE68tXsAFz0+PBWKHa+BOGDaZ5IER a11qT3xsYQfdwNbi3qNoxzKIkwtdV1D+8O1INKn/UsojFZXoTVb7znNF/s0uoMPeg2o6 XqTzxzckjG64tlCy9itEM7wEL74IrzNod9jraA/KC4u6PP+bao/Vj3O8nlguA7aDZaUT 2FZg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=3ei5KTNcwhvdEV4DEpNFWL2FLBtddGkRWDZ/PZs/pyI=; b=V8H3OdfjxBzf/N+0fwBKqDH/BMtYUAeJ2M9Op8a6QNfoYd5/WmaY1x6FkwIKYbrag6 SO4Qga/8wrQK8sx2giwaXMjXpMd0V+USn6owwTkju/4iChPSdOZOHGCVawgM5ro+OXIm SwZZkTpSFsmuKeT/x4zMdTi9nNeTeR7aIsJ93w+ee+usiHiNjY3FvgoJ7e7UZQqO8o33 fa2fYJzs1/PIF4+thF0XGXcCbCjUylf9B/ibrouaD41dafmozVNvlxxglHusy+XhViwB rbqWMWOi2g0EwiEsX0NOfRUMTlsALWpzTUvh/PeemTRVmDuJA83azw3GIGHoNpfO1RH0 Ipww== X-Gm-Message-State: ANoB5pkzVwPPlNaKZjKwL+Pr2YgIYvYQv3Ll0jLY2CYSuM07n1Oxq/Bb BMIpcD9x/vBksyCcCN+U1A5m2T3EAts/AQ== X-Google-Smtp-Source: AA0mqf76MOJIlJMgyK8SACjurwBgdmgD8UMbYjNn4NyBxOTE9EDQqi80GPLESjM3J7HUUAm7YnPRl075qddTHg== X-Received: from ricarkol4.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:1248]) (user=ricarkol job=sendgmr) by 2002:a25:50e:0:b0:6dd:657d:ffc8 with SMTP id 14-20020a25050e000000b006dd657dffc8mr3797725ybf.269.1668241056148; Sat, 12 Nov 2022 00:17:36 -0800 (PST) Date: Sat, 12 Nov 2022 08:17:13 +0000 In-Reply-To: <20221112081714.2169495-1-ricarkol@google.com> Mime-Version: 1.0 
References: <20221112081714.2169495-1-ricarkol@google.com> X-Mailer: git-send-email 2.38.1.431.g37b22c650d-goog Message-ID: <20221112081714.2169495-12-ricarkol@google.com> Subject: [RFC PATCH 11/12] KVM: arm64: Split huge pages during KVM_CLEAR_DIRTY_LOG From: Ricardo Koller To: pbonzini@redhat.com, maz@kernel.org, oupton@google.com, dmatlack@google.com, qperret@google.com, catalin.marinas@arm.com, andrew.jones@linux.dev, seanjc@google.com, alexandru.elisei@arm.com, suzuki.poulose@arm.com, eric.auger@redhat.com, gshan@redhat.com, reijiw@google.com, rananta@google.com, bgardon@google.com Cc: kvm@vger.kernel.org, kvmarm@lists.linux.dev, kvmarm@lists.cs.columbia.edu, ricarkol@gmail.com, Ricardo Koller Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This is the arm64 counterpart of commit cb00a70bd4b7 ("KVM: x86/mmu: Split huge pages mapped by the TDP MMU during KVM_CLEAR_DIRTY_LOG"), which has the benefit of splitting the cost of splitting a memslot across multiple ioctls. Split huge pages on the range specified using KVM_CLEAR_DIRTY_LOG. And do not split when enabling dirty logging if KVM_DIRTY_LOG_INITIALLY_SET is set. Signed-off-by: Ricardo Koller --- arch/arm64/kvm/mmu.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 7881df411643..b8211d833cc1 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1076,8 +1076,8 @@ static void kvm_mmu_split_memory_region(struct kvm *kvm, int slot) * @mask: The mask of pages at offset 'gfn_offset' in this memory * slot to enable dirty logging on * - * Writes protect selected pages to enable dirty logging for them. Caller must - * acquire kvm->mmu_lock. + * Splits selected pages to PAGE_SIZE and then write-protects them to enable + * dirty logging for them. Caller must acquire kvm->mmu_lock. */ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm, struct kvm_memory_slot *slot, @@ -1090,6 +1090,14 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm, lockdep_assert_held_write(&kvm->mmu_lock); stage2_wp_range(&kvm->arch.mmu, start, end); + + /* + * If initially-all-set mode is not set, then huge-pages were already + * split when enabling dirty logging: no need to do it again. + */ + if (kvm_dirty_log_manual_protect_and_init_set(kvm) && + READ_ONCE(eager_page_split)) + kvm_mmu_split_huge_pages(kvm, start, end); } static void kvm_send_hwpoison_signal(unsigned long address, short lsb) @@ -1474,10 +1482,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, */ if (fault_status == FSC_PERM && vma_pagesize == fault_granule) ret = kvm_pgtable_stage2_relax_perms(pgt, fault_ipa, prot); - else + else { ret = kvm_pgtable_stage2_map(pgt, fault_ipa, vma_pagesize, __pfn_to_phys(pfn), prot, memcache, KVM_PGTABLE_WALK_SHARED); + } /* Mark the page dirty only if the fault is handled successfully */ if (writable && !ret) { @@ -1887,7 +1896,9 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, * this when deleting, moving, disabling dirty logging, or * creating the memslot (a nop). Doing it for deletes makes * sure we don't leak memory, and there's no need to keep the - * cache around for any of the other cases. + * cache around for any of the other cases. Keeping the cache + * is useful for successive KVM_CLEAR_DIRTY_LOG calls, which are + * not handled in this function.
*/ kvm_mmu_free_memory_cache(&kvm->arch.mmu.split_page_cache); } From patchwork Sat Nov 12 08:17:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ricardo Koller X-Patchwork-Id: 13041042 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3D52AC433FE for ; Sat, 12 Nov 2022 08:18:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234928AbiKLISK (ORCPT ); Sat, 12 Nov 2022 03:18:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40430 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234883AbiKLIR5 (ORCPT ); Sat, 12 Nov 2022 03:17:57 -0500 Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 73F0B60EA8 for ; Sat, 12 Nov 2022 00:17:38 -0800 (PST) Received: by mail-pl1-x64a.google.com with SMTP id i8-20020a170902c94800b0018712ccd6bbso5042437pla.1 for ; Sat, 12 Nov 2022 00:17:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=as/2ydjmznvnqW0FloxcvXboHzV8IbyszOyvBp/KQys=; b=f/ISQTh0HiQX6r+4v/q5aqiowpKz0+27Za0micS2DOH6sllgRx0Kgeo9gu8fO5N4aA zqvPWUTQDkk9e7UPcJR2wXAGmeUJsPBqizQQD1cH1HsrwmiYcVcRmP4ZG5A5Qt6QKJPd iXzuEFPduKPVjomG4IA/pSeZ/BWwDFpxX7kWuT/+4gzVH0o6Z9LXCmfJuRMnP/Mnfnmz FnfwZZr5MtkD0AAMjunxNO8AysKUILqbed0FsdoErIubA7InSb9PVF61b8Vb7aKpZUdZ jiFL5+EBYszsm+1wjNA82c7C4yjdenAgFmc+9xlKCIELuelQUqq/cuWa/ZrcOoWhF0mZ 2wvQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=as/2ydjmznvnqW0FloxcvXboHzV8IbyszOyvBp/KQys=; b=LO5MKE3sVyRD9rYJT8ImHaIkJuRfhfiMbYiH22xOTk7xH8qEhry57V6m7IYvnN8Hd4 7I51KKs4Lb7gvllCNE9L5LzNU+oO/lCn393+MHfP9BanVw6/rmmZbH2JyiquQlhAjR88 DXXlrZgMka40Fsil+iurS+2FPyoCDHLaD71ekNRyRN5VePFi22aVPPYeJQ2FN4lM9s4q lOoeATC10uRRTqcUEReAC4wPeVaqEooDZ18/RF+dxlOmnqkullcw5ixCS4BOJaYXtdUh SvCihojDWcEGkGFYSf0A0GVw6QYMEZF1sr/hwH1FTI+yzqLZUno//FMhRk2oAU/+z62q NjUg== X-Gm-Message-State: ANoB5pm3evIY+xHLuaQT88SpB/Qkfqe8QIMCp7+rHhqe4xU6i5repPbW DXAMDWNuCJUxF80NEUBxf0NowW1wZA5LLg== X-Google-Smtp-Source: AA0mqf6Ek0e3YKat6f7tj1La0x0rF/DxDBexU17GfPSQWBrUEIK5BDbr6YCfl58li1dsPZEpxH79k8ZyyvfX5Q== X-Received: from ricarkol4.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:1248]) (user=ricarkol job=sendgmr) by 2002:a62:e412:0:b0:56d:a1fc:7000 with SMTP id r18-20020a62e412000000b0056da1fc7000mr6021367pfh.35.1668241057925; Sat, 12 Nov 2022 00:17:37 -0800 (PST) Date: Sat, 12 Nov 2022 08:17:14 +0000 In-Reply-To: <20221112081714.2169495-1-ricarkol@google.com> Mime-Version: 1.0 References: <20221112081714.2169495-1-ricarkol@google.com> X-Mailer: git-send-email 2.38.1.431.g37b22c650d-goog Message-ID: <20221112081714.2169495-13-ricarkol@google.com> Subject: [RFC PATCH 12/12] KVM: arm64: Use local TLBI on permission relaxation From: Ricardo Koller To: pbonzini@redhat.com, maz@kernel.org, oupton@google.com, dmatlack@google.com, qperret@google.com, catalin.marinas@arm.com, andrew.jones@linux.dev, seanjc@google.com, alexandru.elisei@arm.com, 
suzuki.poulose@arm.com, eric.auger@redhat.com, gshan@redhat.com, reijiw@google.com, rananta@google.com, bgardon@google.com Cc: kvm@vger.kernel.org, kvmarm@lists.linux.dev, kvmarm@lists.cs.columbia.edu, ricarkol@gmail.com, Ricardo Koller Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Marc Zyngier Broadcasted TLB invalidations (TLBI) are usually less performant than their local variant. In particular, we observed some implementations that take milliseconds to complete parallel broadcasted TLBIs. It's safe to use local, non-shareable TLBIs when relaxing permissions on a PTE in the KVM case for a couple of reasons. First, according to the ARM Arm (DDI 0487H.a D5-4913), permission relaxation does not need break-before-make. Second, KVM does not set the VTTBR_EL2.CnP bit, so each PE has its own TLB entry for the same page. KVM could tolerate that when doing permission relaxation (i.e., not having changes broadcasted to all PEs). Signed-off-by: Marc Zyngier Signed-off-by: Ricardo Koller --- arch/arm64/include/asm/kvm_asm.h | 4 +++ arch/arm64/kvm/hyp/nvhe/hyp-main.c | 10 ++++++ arch/arm64/kvm/hyp/nvhe/tlb.c | 54 ++++++++++++++++++++++++++++++ arch/arm64/kvm/hyp/pgtable.c | 2 +- arch/arm64/kvm/hyp/vhe/tlb.c | 32 ++++++++++++++++++ 5 files changed, 101 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index 43c3bc0f9544..bb17b2ead4c7 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -68,6 +68,7 @@ enum __kvm_host_smccc_func { __KVM_HOST_SMCCC_FUNC___kvm_vcpu_run, __KVM_HOST_SMCCC_FUNC___kvm_flush_vm_context, __KVM_HOST_SMCCC_FUNC___kvm_tlb_flush_vmid_ipa, + __KVM_HOST_SMCCC_FUNC___kvm_tlb_flush_vmid_ipa_nsh, __KVM_HOST_SMCCC_FUNC___kvm_tlb_flush_vmid, __KVM_HOST_SMCCC_FUNC___kvm_flush_cpu_context, __KVM_HOST_SMCCC_FUNC___kvm_timer_set_cntvoff, @@ -225,6 +226,9 @@ extern void __kvm_flush_vm_context(void); extern void __kvm_flush_cpu_context(struct kvm_s2_mmu *mmu); extern void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, phys_addr_t ipa, int level); +extern void __kvm_tlb_flush_vmid_ipa_nsh(struct kvm_s2_mmu *mmu, + phys_addr_t ipa, + int level); extern void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu); extern void __kvm_timer_set_cntvoff(u64 cntvoff); diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c index 728e01d4536b..c6bf1e49ca93 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c @@ -125,6 +125,15 @@ static void handle___kvm_tlb_flush_vmid_ipa(struct kvm_cpu_context *host_ctxt) __kvm_tlb_flush_vmid_ipa(kern_hyp_va(mmu), ipa, level); } +static void handle___kvm_tlb_flush_vmid_ipa_nsh(struct kvm_cpu_context *host_ctxt) +{ + DECLARE_REG(struct kvm_s2_mmu *, mmu, host_ctxt, 1); + DECLARE_REG(phys_addr_t, ipa, host_ctxt, 2); + DECLARE_REG(int, level, host_ctxt, 3); + + __kvm_tlb_flush_vmid_ipa_nsh(kern_hyp_va(mmu), ipa, level); +} + static void handle___kvm_tlb_flush_vmid(struct kvm_cpu_context *host_ctxt) { DECLARE_REG(struct kvm_s2_mmu *, mmu, host_ctxt, 1); @@ -315,6 +324,7 @@ static const hcall_t host_hcall[] = { HANDLE_FUNC(__kvm_vcpu_run), HANDLE_FUNC(__kvm_flush_vm_context), HANDLE_FUNC(__kvm_tlb_flush_vmid_ipa), + HANDLE_FUNC(__kvm_tlb_flush_vmid_ipa_nsh), HANDLE_FUNC(__kvm_tlb_flush_vmid), HANDLE_FUNC(__kvm_flush_cpu_context), HANDLE_FUNC(__kvm_timer_set_cntvoff), diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c index d296d617f589..ef2b70587f93 ---
a/arch/arm64/kvm/hyp/nvhe/tlb.c +++ b/arch/arm64/kvm/hyp/nvhe/tlb.c @@ -109,6 +109,60 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, __tlb_switch_to_host(&cxt); } +void __kvm_tlb_flush_vmid_ipa_nsh(struct kvm_s2_mmu *mmu, + phys_addr_t ipa, int level) +{ + struct tlb_inv_context cxt; + + dsb(nshst); + + /* Switch to requested VMID */ + __tlb_switch_to_guest(mmu, &cxt); + + /* + * We could do so much better if we had the VA as well. + * Instead, we invalidate Stage-2 for this IPA, and the + * whole of Stage-1. Weep... + */ + ipa >>= 12; + __tlbi_level(ipas2e1, ipa, level); + + /* + * We have to ensure completion of the invalidation at Stage-2, + * since a table walk on another CPU could refill a TLB with a + * complete (S1 + S2) walk based on the old Stage-2 mapping if + * the Stage-1 invalidation happened first. + */ + dsb(nsh); + __tlbi(vmalle1); + dsb(nsh); + isb(); + + /* + * If the host is running at EL1 and we have a VPIPT I-cache, + * then we must perform I-cache maintenance at EL2 in order for + * it to have an effect on the guest. Since the guest cannot hit + * I-cache lines allocated with a different VMID, we don't need + * to worry about junk out of guest reset (we nuke the I-cache on + * VMID rollover), but we do need to be careful when remapping + * executable pages for the same guest. This can happen when KSM + * takes a CoW fault on an executable page, copies the page into + * a page that was previously mapped in the guest and then needs + * to invalidate the guest view of the I-cache for that page + * from EL1. To solve this, we invalidate the entire I-cache when + * unmapping a page from a guest if we have a VPIPT I-cache but + * the host is running at EL1. As above, we could do better if + * we had the VA. + * + * The moral of this story is: if you have a VPIPT I-cache, then + * you should be running with VHE enabled. + */ + if (icache_is_vpipt()) + icache_inval_all_pou(); + + __tlb_switch_to_host(&cxt); +} + void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu) { struct tlb_inv_context cxt; diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index 36b81df5687e..4f8b610316ed 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -1140,7 +1140,7 @@ int kvm_pgtable_stage2_relax_perms(struct kvm_pgtable *pgt, u64 addr, ret = stage2_update_leaf_attrs(pgt, addr, 1, set, clr, NULL, &level, KVM_PGTABLE_WALK_SHARED); if (!ret) - kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, pgt->mmu, addr, level); + kvm_call_hyp(__kvm_tlb_flush_vmid_ipa_nsh, pgt->mmu, addr, level); return ret; } diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c index 24cef9b87f9e..e69da550cdc5 100644 --- a/arch/arm64/kvm/hyp/vhe/tlb.c +++ b/arch/arm64/kvm/hyp/vhe/tlb.c @@ -111,6 +111,38 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, __tlb_switch_to_host(&cxt); } +void __kvm_tlb_flush_vmid_ipa_nsh(struct kvm_s2_mmu *mmu, + phys_addr_t ipa, int level) +{ + struct tlb_inv_context cxt; + + dsb(nshst); + + /* Switch to requested VMID */ + __tlb_switch_to_guest(mmu, &cxt); + + /* + * We could do so much better if we had the VA as well. + * Instead, we invalidate Stage-2 for this IPA, and the + * whole of Stage-1. Weep... + */ + ipa >>= 12; + __tlbi_level(ipas2e1, ipa, level); + + /* + * We have to ensure completion of the invalidation at Stage-2, + * since a table walk on another CPU could refill a TLB with a + * complete (S1 + S2) walk based on the old Stage-2 mapping if + * the Stage-1 invalidation happened first. 
+ */ + dsb(nsh); + __tlbi(vmalle1); + dsb(nsh); + isb(); + + __tlb_switch_to_host(&cxt); +} + void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu) { struct tlb_inv_context cxt;
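
For reference, the split-cache sizing used by patch 09 can be checked with plain arithmetic. The following is a minimal userspace sketch, not kernel code: SZ_2M, SZ_1G and DIV_ROUND_UP_ULL are local stand-ins for the kernel macros, and it only reproduces the worst-case table count for one 1GB chunk with 4KB granules.

#include <stdio.h>

#define SZ_2M (2ULL << 20)
#define SZ_1G (1ULL << 30)
#define DIV_ROUND_UP_ULL(n, d) (((n) + (d) - 1) / (d))

/* Same arithmetic as the capacity macro in patch 09 (worst case: 4KB granules). */
#define EAGER_PAGE_SPLIT_CHUNK_SIZE SZ_1G
#define EAGER_PAGE_SPLIT_CACHE_CAPACITY \
	(DIV_ROUND_UP_ULL(EAGER_PAGE_SPLIT_CHUNK_SIZE, SZ_1G) + \
	 DIV_ROUND_UP_ULL(EAGER_PAGE_SPLIT_CHUNK_SIZE, SZ_2M))

int main(void)
{
	/* One table to turn the 1GB block into 2MB blocks... */
	unsigned long long l1 = DIV_ROUND_UP_ULL(EAGER_PAGE_SPLIT_CHUNK_SIZE, SZ_1G);
	/* ...plus 512 tables to turn each 2MB block into 4KB pages. */
	unsigned long long l2 = DIV_ROUND_UP_ULL(EAGER_PAGE_SPLIT_CHUNK_SIZE, SZ_2M);

	printf("1GB-level tables: %llu, 2MB-level tables: %llu, capacity: %llu\n",
	       l1, l2, (unsigned long long)EAGER_PAGE_SPLIT_CACHE_CAPACITY);
	return 0;
}

This prints 1, 512 and 513, matching the worst case called out in the comment; the other granule/huge-page combinations listed there need strictly fewer tables, so a 513-page cache always covers one chunk.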
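The chunking arithmetic that lets kvm_mmu_split_huge_pages() drop mmu_lock between 1GB pieces can also be exercised in isolation. The sketch below is a userspace model under stated assumptions: ALIGN_DOWN and SZ_1G are local stand-ins, range_addr_end() mirrors __stage2_range_addr_end(), and the memslot addresses are made up; the loop only prints the sub-ranges that the real code would hand to kvm_pgtable_stage2_split().

#include <stdio.h>
#include <stdint.h>

#define SZ_1G (1ULL << 30)
/* ALIGN_DOWN for a power-of-two alignment, as in the kernel macro. */
#define ALIGN_DOWN(x, a) ((x) & ~((uint64_t)(a) - 1))

/* Mirrors __stage2_range_addr_end(): clamp the next chunk boundary to 'end'. */
static uint64_t range_addr_end(uint64_t addr, uint64_t end, uint64_t size)
{
	uint64_t boundary = ALIGN_DOWN(addr + size, size);

	return (boundary - 1 < end - 1) ? boundary : end;
}

int main(void)
{
	/* Hypothetical memslot: 2.5GB, starting 64MB past the 1GB mark. */
	uint64_t addr = SZ_1G + (64ULL << 20);
	uint64_t end = addr + 2 * SZ_1G + (512ULL << 20);
	uint64_t next;

	do {
		next = range_addr_end(addr, end, SZ_1G);
		/* The real walker may drop mmu_lock and refill the cache here. */
		printf("split chunk: 0x%llx - 0x%llx\n",
		       (unsigned long long)addr, (unsigned long long)next);
	} while (addr = next, addr != end);

	return 0;
}

The "- 1" in the comparison is the same trick as in the kernel helper: it keeps the comparison correct when the aligned boundary or the end address wraps to 0, and otherwise simply picks whichever of boundary and end comes first.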
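Finally, patches 10 and 11 derive the IPA range to write-protect (and optionally split) from the dirty-bitmap mask using __ffs()/__fls(). A small userspace model of that computation is below; ffs_bit()/fls_bit() are stand-ins built on GCC builtins, and base_gfn/mask are hypothetical values (the kernel caller never passes a zero mask).

#include <stdio.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* Stand-ins for the kernel's __ffs()/__fls(); 'mask' must be non-zero. */
static unsigned int ffs_bit(unsigned long long mask)
{
	return __builtin_ctzll(mask);
}

static unsigned int fls_bit(unsigned long long mask)
{
	return 8 * sizeof(mask) - 1 - __builtin_clzll(mask);
}

int main(void)
{
	/* Hypothetical values: four dirty pages starting at gfn base_gfn + 4. */
	uint64_t base_gfn = 0x1000;
	unsigned long long mask = 0x00f0;

	uint64_t start = (base_gfn + ffs_bit(mask)) << PAGE_SHIFT;
	uint64_t end = (base_gfn + fls_bit(mask) + 1) << PAGE_SHIFT;

	/*
	 * This is the range the kernel passes to stage2_wp_range() and, with
	 * initially-all-set and eager_page_split, to kvm_mmu_split_huge_pages().
	 */
	printf("range: 0x%llx - 0x%llx\n",
	       (unsigned long long)start, (unsigned long long)end);
	return 0;
}

With these values the covered range is 0x1004000 - 0x1008000, i.e. everything from the lowest to the highest set bit of the mask, which is why sparse masks can still cause more splitting work than the number of dirty pages alone would suggest.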