From patchwork Wed Sep 20 18:17:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13393162 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 15E1DC04FD1 for ; Wed, 20 Sep 2023 18:18:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=KhWSm2BjVXals9uGR4fE0KLeGDVZOPQJ3zBQqb2SOpg=; b=2nkQAjLtL4UmcR FFLAwaCR3VwfsCEIEhNi/8vE6an7m6bo1BzdWwGb/m0M85MOJZ7ok3+7lX1sXi3xnygAi2xA3luve avj1x9fEad9linlrTRnZX92xo39lUVPj4hdckYJ0AEmgKW/79eOzjwVTQebhKJoWMViZDAHnMaKYG niN+ScNdIauep/ETC1QrNSQvZChsWeTmpw2gxCJVgypRLmoc7VVHEnOkd0DhbSNVQutsK6VOxUMrR kDtUvMwRVLWEnKX5GQbFd9uOMgSxQFNdAKQIXYW3MVUFxaDNIScH89j50wqtw/64KRZJEh4vst4GY 3wxKuhkX9U6iQYVXXEvQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qj1m0-003pwv-30; Wed, 20 Sep 2023 18:18:05 +0000 Received: from dfw.source.kernel.org ([139.178.84.217]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1qj1le-003pmo-2M for linux-arm-kernel@lists.infradead.org; Wed, 20 Sep 2023 18:17:45 +0000 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id F02B561DBD; Wed, 20 Sep 2023 18:17:40 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C210DC43395; Wed, 20 Sep 2023 18:17:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1695233860; bh=L3P7HNXuwH2qYd0d9de5CqkeyTb7Et6cqFHWOzXUnnA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=d1L6yCoq4ULQTx+jEq+SJ/Nri/Y3zEUI43tG/ipXxbUHEtPdl302HxXzmzCHFM9WT IGFfm+n/2IImYe3ObKOfmeG8ayb9MOrPJwdvu7BWHVs4kjTcy38lZgAkDb8Lbe9ctt ezR8+SstYN+DnBPIxtZXHWFNGwn6n/swMX0u1PwK6PlvBApF19e2K8UH+JXLeTsgfz TBygILhmUQyzCodUCDpeqfHYkhytGqgTCe/jVjD8CLxj10wsOQ2OklkkPV+0/exudp cmmLjd3m1bnjS0WhBzhASFybLaSwa3oPDguziwmn+sv09NLNyICamdgaM/89eiseZQ PEKaXcyp/qwwg== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1qj1la-00Ejx0-M2; Wed, 20 Sep 2023 19:17:38 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org Cc: James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu , Joey Gouly , Shameerali Kolothum Thodi , Xu Zhao , Eric Auger Subject: [PATCH v2 08/11] KVM: arm64: Build MPIDR to vcpu index cache at runtime Date: Wed, 20 Sep 2023 19:17:28 +0100 Message-Id: <20230920181731.2232453-9-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230920181731.2232453-1-maz@kernel.org> References: <20230920181731.2232453-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com, joey.gouly@arm.com, shameerali.kolothum.thodi@huawei.com, zhaoxu.35@bytedance.com, eric.auger@redhat.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230920_111742_885862_1DAB10D3 X-CRM114-Status: GOOD ( 24.70 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org The MPIDR_EL1 register contains a unique value that identifies the CPU. The only problem with it is that it is stupidly large (32 bits, once the useless stuff is removed). Trying to obtain a vcpu from an MPIDR value is a fairly common, yet costly operation: we iterate over all the vcpus until we find the correct one. While this is cheap for small VMs, it is pretty expensive on large ones, specially if you are trying to get to the one that's at the end of the list... In order to help with this, it is important to realise that the MPIDR values are actually structured, and that implementations tend to use a small number of significant bits in the 32bit space. We can use this fact to our advantage by computing a small hash table that uses the "compression" of the significant MPIDR bits as an index, giving us the vcpu index as a result. Given that the MPIDR values can be supplied by userspace, and that an evil VMM could decide to make *all* bits significant, resulting in a 4G-entry table, we only use this method if the resulting table fits in a single page. Otherwise, we fallback to the good old iterative method. Nothing uses that table just yet, but keep your eyes peeled. Reviewed-by: Joey Gouly Tested-by: Joey Gouly Tested-by: Shameer Kolothum Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/kvm_host.h | 28 ++++++++++++++++ arch/arm64/kvm/arm.c | 54 +++++++++++++++++++++++++++++++ 2 files changed, 82 insertions(+) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index af06ccb7ee34..4b5a57a60b6a 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -202,6 +202,31 @@ struct kvm_protected_vm { struct kvm_hyp_memcache teardown_mc; }; +struct kvm_mpidr_data { + u64 mpidr_mask; + DECLARE_FLEX_ARRAY(u16, cmpidr_to_idx); +}; + +static inline u16 kvm_mpidr_index(struct kvm_mpidr_data *data, u64 mpidr) +{ + unsigned long mask = data->mpidr_mask; + u64 aff = mpidr & MPIDR_HWID_BITMASK; + int nbits, bit, bit_idx = 0; + u16 index = 0; + + /* + * If this looks like RISC-V's BEXT or x86's PEXT + * instructions, it isn't by accident. + */ + nbits = fls(mask); + for_each_set_bit(bit, &mask, nbits) { + index |= (aff & BIT(bit)) >> (bit - bit_idx); + bit_idx++; + } + + return index; +} + struct kvm_arch { struct kvm_s2_mmu mmu; @@ -248,6 +273,9 @@ struct kvm_arch { /* VM-wide vCPU feature set */ DECLARE_BITMAP(vcpu_features, KVM_VCPU_MAX_FEATURES); + /* MPIDR to vcpu index mapping, optional */ + struct kvm_mpidr_data *mpidr_data; + /* * VM-wide PMU filter, implemented as a bitmap and big enough for * up to 2^10 events (ARMv8.0) or 2^16 events (ARMv8.1+). diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 23c22dbd1969..d9de4d3a339c 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -205,6 +205,7 @@ void kvm_arch_destroy_vm(struct kvm *kvm) if (is_protected_kvm_enabled()) pkvm_destroy_hyp_vm(kvm); + kfree(kvm->arch.mpidr_data); kvm_destroy_vcpus(kvm); kvm_unshare_hyp(kvm, kvm + 1); @@ -578,6 +579,57 @@ static int kvm_vcpu_initialized(struct kvm_vcpu *vcpu) return vcpu_get_flag(vcpu, VCPU_INITIALIZED); } +static void kvm_init_mpidr_data(struct kvm *kvm) +{ + struct kvm_mpidr_data *data = NULL; + unsigned long c, mask, nr_entries; + u64 aff_set = 0, aff_clr = ~0UL; + struct kvm_vcpu *vcpu; + + mutex_lock(&kvm->arch.config_lock); + + if (kvm->arch.mpidr_data || atomic_read(&kvm->online_vcpus) == 1) + goto out; + + kvm_for_each_vcpu(c, vcpu, kvm) { + u64 aff = kvm_vcpu_get_mpidr_aff(vcpu); + aff_set |= aff; + aff_clr &= aff; + } + + /* + * A significant bit can be either 0 or 1, and will only appear in + * aff_set. Use aff_clr to weed out the useless stuff. + */ + mask = aff_set ^ aff_clr; + nr_entries = BIT_ULL(hweight_long(mask)); + + /* + * Don't let userspace fool us. If we need more than a single page + * to describe the compressed MPIDR array, just fall back to the + * iterative method. Single vcpu VMs do not need this either. + */ + if (struct_size(data, cmpidr_to_idx, nr_entries) <= PAGE_SIZE) + data = kzalloc(struct_size(data, cmpidr_to_idx, nr_entries), + GFP_KERNEL_ACCOUNT); + + if (!data) + goto out; + + data->mpidr_mask = mask; + + kvm_for_each_vcpu(c, vcpu, kvm) { + u64 aff = kvm_vcpu_get_mpidr_aff(vcpu); + u16 index = kvm_mpidr_index(data, aff); + + data->cmpidr_to_idx[index] = c; + } + + kvm->arch.mpidr_data = data; +out: + mutex_unlock(&kvm->arch.config_lock); +} + /* * Handle both the initialisation that is being done when the vcpu is * run for the first time, as well as the updates that must be @@ -601,6 +653,8 @@ int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu) if (likely(vcpu_has_run_once(vcpu))) return 0; + kvm_init_mpidr_data(kvm); + kvm_arm_vcpu_init_debug(vcpu); if (likely(irqchip_in_kernel(kvm))) {