From patchwork Tue Jun 25 19:07:14 2024
X-Patchwork-Submitter: Andrew Cooper
X-Patchwork-Id: 13711846
From: Andrew Cooper
To: Xen-devel
Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Stefano Stabellini,
 Julien Grall, Volodymyr Babchuk, Bertrand Marquis, Michal Orzel,
 Oleksii Kurochko
Subject: [PATCH 1/6] x86/vmx: Rewrite vmx_sync_pir_to_irr() to be more efficient
Date: Tue, 25 Jun 2024 20:07:14 +0100
Message-Id: <20240625190719.788643-2-andrew.cooper3@citrix.com>
In-Reply-To: <20240625190719.788643-1-andrew.cooper3@citrix.com>
References: <20240625190719.788643-1-andrew.cooper3@citrix.com>

There are two issues. First, pi_test_and_clear_on() pulls the cache-line to
the CPU and dirties it even if there's nothing outstanding. Second, the final
for_each_set_bit() is O(256) when O(8) would do, and would avoid multiple
atomic updates to the same IRR word.

Rewrite it from scratch, explaining what's going on at each step.

Bloat-o-meter reports 177 -> 145 (net -32), but the better aspect is the
removal of calls to __find_{first,next}_bit() hidden behind for_each_set_bit().

No functional change.

Signed-off-by: Andrew Cooper
---
CC: Jan Beulich
CC: Roger Pau Monné
CC: Stefano Stabellini
CC: Julien Grall
CC: Volodymyr Babchuk
CC: Bertrand Marquis
CC: Michal Orzel
CC: Oleksii Kurochko

The main purpose of this is to get rid of for_each_set_bit().

Full side-by-side diff:
  https://termbin.com/5wsb

The first loop ends up being unrolled identically, although there are 64bit
movs to reload 0 for the xchg, which is definitely suboptimal. Opencoding
asm ("xchg") without a memory clobber gets to 32bit movs, which is an
improvement but not ideal. However I didn't fancy going that far.

Also, the entirety of pi_desc is embedded in struct vcpu, which means when
we're executing in Xen, the prefetcher is going to be stealing it back from
the IOMMU all the time. This is a datastructure which really should *not* be
adjacent to all the other misc data in the vcpu.
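For illustration only, the read-and-clear then merge steps described above
can be sketched in user space. This is not the code in the patch: it assumes
C11 atomics in place of Xen's xchg() and the "lock or" asm, it assumes the
32-bit IRR registers sit 0x10 bytes apart, and the names pir_to_irr_sketch(),
NR_VECTORS and IRR_STRIDE are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NR_VECTORS 256                        /* stand-in for X86_NR_VECTORS */
#define IRR_STRIDE (0x10 / sizeof(uint32_t))  /* assumed 0x10-byte register spacing */

/*
 * Hypothetical user-space model: move every pending bit from a 256-bit PIR
 * into a scattered IRR.  Returns the number of 32-bit IRR words updated.
 */
static unsigned int pir_to_irr_sketch(_Atomic uint64_t pir[NR_VECTORS / 64],
                                      _Atomic uint32_t *irr)
{
    uint32_t pending[NR_VECTORS / 32];
    unsigned int updated = 0;

    assert(pir && irr);

    /* Read and clear each 64-bit PIR word with one atomic exchange. */
    for ( unsigned int i = 0; i < NR_VECTORS / 64; ++i )
    {
        uint64_t word = atomic_exchange(&pir[i], 0);

        pending[2 * i]     = (uint32_t)word;
        pending[2 * i + 1] = (uint32_t)(word >> 32);
    }

    /* One atomic OR per non-zero word, rather than one per pending vector. */
    for ( unsigned int i = 0; i < NR_VECTORS / 32; ++i )
    {
        if ( !pending[i] )
            continue;

        atomic_fetch_or(&irr[i * IRR_STRIDE], pending[i]);
        ++updated;
    }

    return updated;
}
```

Keeping the exchange loop separate from the merge loop mirrors the intent of
shortening the time spent on the contended cacheline. At most 8 atomic ORs
are issued, however many vectors are pending.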
---
 xen/arch/x86/hvm/vmx/vmx.c | 61 +++++++++++++++++++++++++++++++++-----
 1 file changed, 53 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index f16faa6a615c..948ad48a4757 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2321,18 +2321,63 @@ static void cf_check vmx_deliver_posted_intr(struct vcpu *v, u8 vector)
 
 static void cf_check vmx_sync_pir_to_irr(struct vcpu *v)
 {
-    struct vlapic *vlapic = vcpu_vlapic(v);
-    unsigned int group, i;
-    DECLARE_BITMAP(pending_intr, X86_NR_VECTORS);
+    struct pi_desc *desc = &v->arch.hvm.vmx.pi_desc;
+    union {
+        uint64_t _64[X86_NR_VECTORS / (sizeof(uint64_t) * 8)];
+        uint32_t _32[X86_NR_VECTORS / (sizeof(uint32_t) * 8)];
+    } vec;
+    uint32_t *irr;
+    bool on;
 
-    if ( !pi_test_and_clear_on(&v->arch.hvm.vmx.pi_desc) )
+    /*
+     * The PIR is a contended cacheline which bounces between the CPU and
+     * IOMMU. The IOMMU updates the entire PIR atomically, but we can't
+     * express the same on the CPU side, so care has to be taken.
+     *
+     * First, do a plain read of ON. If the PIR hasn't been modified, this
+     * will keep the cacheline Shared and not pull it Exclusive on the CPU.
+     */
+    if ( !pi_test_on(desc) )
         return;
 
-    for ( group = 0; group < ARRAY_SIZE(pending_intr); group++ )
-        pending_intr[group] = pi_get_pir(&v->arch.hvm.vmx.pi_desc, group);
+    /*
+     * Second, if the plain read said that ON was set, we must clear it with
+     * an atomic action. This will bring the cacheline to Exclusive on the
+     * CPU.
+     *
+     * This should always succeed because no-one else should be playing with
+     * the PIR behind our back, but assert so just in case.
+     */
+    on = pi_test_and_clear_on(desc);
+    ASSERT(on);
 
-    for_each_set_bit(i, pending_intr, X86_NR_VECTORS)
-        vlapic_set_vector(i, &vlapic->regs->data[APIC_IRR]);
+    /*
+     * The cacheline is now Exclusive on the CPU, and the IOMMU has indicated
+     * (via ON being set) that at least one vector is pending too. Atomically
+     * read and clear the entire pending bitmap as fast as we can, to reduce the
+     * window that the IOMMU may steal the cacheline back from us.
+     *
+     * It is a performance concern, but not a correctness concern. If the
+     * IOMMU does steal the cacheline back, we'll just wait to get it back
+     * again.
+     */
+    for ( unsigned int i = 0; i < ARRAY_SIZE(vec._64); ++i )
+        vec._64[i] = xchg(&desc->pir[i], 0);
+
+    /*
+     * Finally, merge the pending vectors into IRR. The IRR register is
+     * scattered in memory, so we have to do this 32 bits at a time.
+     */
+    irr = (uint32_t *)&vcpu_vlapic(v)->regs->data[APIC_IRR];
+    for ( unsigned int i = 0; i < ARRAY_SIZE(vec._32); ++i )
+    {
+        if ( !vec._32[i] )
+            continue;
+
+        asm ( "lock or %[val], %[irr]"
+              : [irr] "+m" (irr[i * 4])
+              : [val] "r" (vec._32[i]) );
+    }
 }
 
 static bool cf_check vmx_test_pir(const struct vcpu *v, uint8_t vec)

From patchwork Tue Jun 25 19:07:15 2024
X-Patchwork-Submitter: Andrew Cooper
X-Patchwork-Id: 13711847
From: Andrew Cooper
To: Xen-devel
Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Stefano Stabellini,
 Julien Grall, Volodymyr Babchuk, Bertrand Marquis, Michal Orzel,
 Oleksii Kurochko
Subject: [PATCH 2/6] xen/bitops: Rename for_each_set_bit() to bitmap_for_each()
Date: Tue, 25 Jun 2024 20:07:15 +0100
Message-Id: <20240625190719.788643-3-andrew.cooper3@citrix.com>
In-Reply-To: <20240625190719.788643-1-andrew.cooper3@citrix.com>
References: <20240625190719.788643-1-andrew.cooper3@citrix.com>

The current implementation wants to take an in-memory bitmap. However, all
ARM callers and all-but-1 x86 callers spill a scalar to the stack in order to
use the "generic arbitrary bitmap" helpers under the hood.

This functions, but it's far from ideal.

Rename the construct and move it into bitmap.h, because having an iterator for
an arbitrary bitmap is a useful thing.

This will allow us to re-implement for_each_set_bit() to be more appropriate
for scalar values.

No functional change.

Signed-off-by: Andrew Cooper
Reviewed-by: Jan Beulich
Acked-by: Michal Orzel
---
CC: Jan Beulich
CC: Roger Pau Monné
CC: Stefano Stabellini
CC: Julien Grall
CC: Volodymyr Babchuk
CC: Bertrand Marquis
CC: Michal Orzel
CC: Oleksii Kurochko
---
 xen/arch/arm/gic-vgic.c          |  2 +-
 xen/arch/arm/vgic.c              |  6 +++---
 xen/arch/arm/vgic/vgic-mmio-v2.c |  2 +-
 xen/arch/arm/vgic/vgic-mmio.c    | 12 ++++++------
 xen/arch/x86/cpu-policy.c        |  8 ++++----
 xen/arch/x86/xstate.c            |  4 ++--
 xen/include/xen/bitmap.h         | 12 ++++++++++++
 xen/include/xen/bitops.h         | 11 -----------
 8 files changed, 29 insertions(+), 28 deletions(-)

diff --git a/xen/arch/arm/gic-vgic.c b/xen/arch/arm/gic-vgic.c
index b99e28722425..0dfff76a238e 100644
--- a/xen/arch/arm/gic-vgic.c
+++ b/xen/arch/arm/gic-vgic.c
@@ -111,7 +111,7 @@ static unsigned int gic_find_unused_lr(struct vcpu *v,
     {
         unsigned int used_lr;
 
-        for_each_set_bit(used_lr, lr_mask, nr_lrs)
+        bitmap_for_each(used_lr, lr_mask, nr_lrs)
         {
             gic_hw_ops->read_lr(used_lr, &lr_val);
             if ( lr_val.virq == p->irq )
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index c04fc4f83f96..57519e834d78 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -429,7 +429,7 @@ void vgic_set_irqs_pending(struct vcpu *v, uint32_t r, unsigned int rank)
     /* LPIs will never be set pending via this function */
     ASSERT(!is_lpi(32 * rank + 31));
 
-    for_each_set_bit( i, &mask, 32 )
+    bitmap_for_each( i, &mask, 32 )
     {
         unsigned int irq = i + 32 * rank;
 
@@ -483,7 +483,7 @@ bool vgic_to_sgi(struct vcpu *v, register_t sgir, enum gic_sgi_mode irqmode,
         perfc_incr(vgic_sgi_list);
         base = target->aff1 << 4;
         bitmap = target->list;
-        for_each_set_bit( i, &bitmap, sizeof(target->list) * 8 )
+        bitmap_for_each( i, &bitmap, sizeof(target->list) * 8 )
         {
             vcpuid = base + i;
             if ( vcpuid >= d->max_vcpus || d->vcpu[vcpuid] == NULL ||
@@ -728,7 +728,7 @@ void vgic_check_inflight_irqs_pending(struct domain *d, struct vcpu *v,
     const unsigned long mask = r;
     unsigned int i;
 
-    for_each_set_bit( i, &mask, 32 )
+    bitmap_for_each( i, &mask, 32 )
     {
         struct pending_irq *p;
         struct vcpu *v_target;
diff --git a/xen/arch/arm/vgic/vgic-mmio-v2.c b/xen/arch/arm/vgic/vgic-mmio-v2.c
index 2e507b10fed5..82d0c22b39fc 100644
--- a/xen/arch/arm/vgic/vgic-mmio-v2.c
+++ b/xen/arch/arm/vgic/vgic-mmio-v2.c
@@ -108,7 +108,7 @@ static void vgic_mmio_write_sgir(struct vcpu *source_vcpu,
         return;
     }
 
-    for_each_set_bit( vcpu_id, &targets, 8 )
+    bitmap_for_each( vcpu_id, &targets, 8 )
     {
         struct vcpu *vcpu = d->vcpu[vcpu_id];
         struct vgic_irq *irq = vgic_get_irq(d, vcpu, intid);
diff --git a/xen/arch/arm/vgic/vgic-mmio.c b/xen/arch/arm/vgic/vgic-mmio.c
index 5d935a73013e..b023ecc20066 100644
--- a/xen/arch/arm/vgic/vgic-mmio.c
+++ b/xen/arch/arm/vgic/vgic-mmio.c
@@ -71,7 +71,7 @@ void vgic_mmio_write_senable(struct vcpu *vcpu,
     uint32_t intid = VGIC_ADDR_TO_INTID(addr, 1);
     unsigned int i;
 
-    for_each_set_bit( i, &val, len * 8 )
+    bitmap_for_each( i, &val, len * 8 )
     {
         struct vgic_irq *irq = vgic_get_irq(vcpu->domain, vcpu, intid + i);
         unsigned long flags;
@@ -116,7 +116,7 @@ void vgic_mmio_write_cenable(struct vcpu *vcpu,
     uint32_t intid = VGIC_ADDR_TO_INTID(addr, 1);
     unsigned int i;
 
-    for_each_set_bit( i, &val, len * 8 )
+    bitmap_for_each( i, &val, len * 8 )
     {
         struct vgic_irq *irq;
         unsigned long flags;
@@ -186,7 +186,7 @@ void vgic_mmio_write_spending(struct vcpu *vcpu,
     unsigned long flags;
     irq_desc_t *desc;
 
-    for_each_set_bit( i, &val, len * 8 )
+    bitmap_for_each( i, &val, len * 8 )
     {
         struct vgic_irq *irq = vgic_get_irq(vcpu->domain, vcpu, intid + i);
 
@@ -234,7 +234,7 @@ void vgic_mmio_write_cpending(struct vcpu *vcpu,
     unsigned long flags;
     irq_desc_t *desc;
 
-    for_each_set_bit( i, &val, len * 8 )
+    bitmap_for_each( i, &val, len * 8 )
     {
         struct vgic_irq *irq = vgic_get_irq(vcpu->domain, vcpu, intid + i);
 
@@ -328,7 +328,7 @@ void vgic_mmio_write_cactive(struct vcpu *vcpu,
     uint32_t intid = VGIC_ADDR_TO_INTID(addr, 1);
     unsigned int i;
 
-    for_each_set_bit( i, &val, len * 8 )
+    bitmap_for_each( i, &val, len * 8 )
     {
         struct vgic_irq *irq = vgic_get_irq(vcpu->domain, vcpu, intid + i);
 
@@ -358,7 +358,7 @@ void vgic_mmio_write_sactive(struct vcpu *vcpu,
     uint32_t intid = VGIC_ADDR_TO_INTID(addr, 1);
     unsigned int i;
 
-    for_each_set_bit( i, &val, len * 8 )
+    bitmap_for_each( i, &val, len * 8 )
     {
         struct vgic_irq *irq = vgic_get_irq(vcpu->domain, vcpu, intid + i);
 
diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
index 304dc20cfab8..cd53bac777dc 100644
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -157,7 +157,7 @@ static void zero_leaves(struct cpuid_leaf *l,
 
 static void sanitise_featureset(uint32_t *fs)
 {
-    /* for_each_set_bit() uses unsigned longs. Extend with zeroes. */
+    /* bitmap_for_each() uses unsigned longs. Extend with zeroes. */
     uint32_t disabled_features[
         ROUNDUP(FSCAPINTS, sizeof(unsigned long)/sizeof(uint32_t))] = {};
     unsigned int i;
@@ -174,8 +174,8 @@ static void sanitise_featureset(uint32_t *fs)
         disabled_features[i] = ~fs[i] & deep_features[i];
     }
 
-    for_each_set_bit(i, (void *)disabled_features,
-                     sizeof(disabled_features) * 8)
+    bitmap_for_each(i, (void *)disabled_features,
+                    sizeof(disabled_features) * 8)
     {
         const uint32_t *dfs = x86_cpu_policy_lookup_deep_deps(i);
         unsigned int j;
@@ -237,7 +237,7 @@ static void recalculate_xstate(struct cpu_policy *p)
     /* Subleafs 2+ */
     xstates &= ~XSTATE_FP_SSE;
     BUILD_BUG_ON(ARRAY_SIZE(p->xstate.comp) < 63);
-    for_each_set_bit ( i, &xstates, 63 )
+    bitmap_for_each ( i, &xstates, 63 )
     {
         /*
          * Pass through size (eax) and offset (ebx) directly. Visbility of
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index 68cdd8fcf021..da9053c0a262 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -606,7 +606,7 @@ unsigned int xstate_uncompressed_size(uint64_t xcr0)
      * with respect their index.
      */
     xcr0 &= ~(X86_XCR0_SSE | X86_XCR0_X87);
-    for_each_set_bit ( i, &xcr0, 63 )
+    bitmap_for_each ( i, &xcr0, 63 )
     {
         const struct xstate_component *c = &raw_cpu_policy.xstate.comp[i];
         unsigned int s = c->offset + c->size;
@@ -634,7 +634,7 @@ unsigned int xstate_compressed_size(uint64_t xstates)
      * componenets require aligning to 64 first.
      */
     xstates &= ~(X86_XCR0_SSE | X86_XCR0_X87);
-    for_each_set_bit ( i, &xstates, 63 )
+    bitmap_for_each ( i, &xstates, 63 )
     {
         const struct xstate_component *c = &raw_cpu_policy.xstate.comp[i];
 
diff --git a/xen/include/xen/bitmap.h b/xen/include/xen/bitmap.h
index b9f980e91930..5dd7db5be9e7 100644
--- a/xen/include/xen/bitmap.h
+++ b/xen/include/xen/bitmap.h
@@ -271,6 +271,18 @@ static inline void bitmap_clear(unsigned long *map, unsigned int start,
 #undef bitmap_switch
 #undef bitmap_bytes
 
+/**
+ * bitmap_for_each - iterate over every set bit in a memory region
+ * @iter: The integer iterator
+ * @addr: The address to base the search on
+ * @size: The maximum size to search
+ */
+#define bitmap_for_each(iter, addr, size)                       \
+    for ( (iter) = find_first_bit(addr, size);                  \
+          (iter) < (size);                                      \
+          (iter) = find_next_bit(addr, size, (iter) + 1) )
+
+
 struct xenctl_bitmap;
 int xenctl_bitmap_to_bitmap(unsigned long *bitmap,
                             const struct xenctl_bitmap *xenctl_bitmap,
diff --git a/xen/include/xen/bitops.h b/xen/include/xen/bitops.h
index 6a5e28730a25..24de0835b7ab 100644
--- a/xen/include/xen/bitops.h
+++ b/xen/include/xen/bitops.h
@@ -248,17 +248,6 @@ static inline __u32 ror32(__u32 word, unsigned int shift)
 #define __L16(x) (((x) & 0x0000ff00U) ? ( 8 + __L8( (x) >> 8)) : __L8( x))
 #define ilog2(x) (((x) & 0xffff0000U) ? (16 + __L16((x) >> 16)) : __L16(x))
 
-/**
- * for_each_set_bit - iterate over every set bit in a memory region
- * @bit: The integer iterator
- * @addr: The address to base the search on
- * @size: The maximum size to search
- */
-#define for_each_set_bit(bit, addr, size)               \
-    for ( (bit) = find_first_bit(addr, size);           \
-          (bit) < (size);                               \
-          (bit) = find_next_bit(addr, size, (bit) + 1) )
-
 #define BIT_WORD(nr) ((nr) / BITS_PER_LONG)
 
 #endif /* XEN_BITOPS_H */

From patchwork Tue Jun 25 19:07:16 2024
X-Patchwork-Submitter: Andrew Cooper
X-Patchwork-Id: 13711849
From: Andrew Cooper
To: Xen-devel
Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Stefano Stabellini,
 Julien Grall, Volodymyr Babchuk, Bertrand Marquis, Michal Orzel,
 Oleksii Kurochko
Subject: [PATCH for-4.19? 3/6] xen/macros: Introduce BUILD_ERROR()
Date: Tue, 25 Jun 2024 20:07:16 +0100
Message-Id: <20240625190719.788643-4-andrew.cooper3@citrix.com>
In-Reply-To: <20240625190719.788643-1-andrew.cooper3@citrix.com>
References: <20240625190719.788643-1-andrew.cooper3@citrix.com>

... and use it in self-tests.h.

This is intended to replace constructs such as __bitop_bad_size(). It
produces a better diagnostic, and is MISRA-friendly.

Signed-off-by: Andrew Cooper
Reviewed-by: Jan Beulich
---
CC: Jan Beulich
CC: Roger Pau Monné
CC: Stefano Stabellini
CC: Julien Grall
CC: Volodymyr Babchuk
CC: Bertrand Marquis
CC: Michal Orzel
CC: Oleksii Kurochko

RFC for-4.19. This can be used to not introduce new MISRA violations when
adjusting __bitop_bad_size(). It's safe to pull out of this series.
---
 xen/include/xen/macros.h     | 2 ++
 xen/include/xen/self-tests.h | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/xen/include/xen/macros.h b/xen/include/xen/macros.h
index ec89f4654fcf..8441d7e7d66a 100644
--- a/xen/include/xen/macros.h
+++ b/xen/include/xen/macros.h
@@ -59,6 +59,8 @@
 #define BUILD_BUG_ON(cond) ((void)BUILD_BUG_ON_ZERO(cond))
 #endif
 
+#define BUILD_ERROR(msg) asm ( ".error \"" msg "\"" )
+
 /* Hide a value from the optimiser. */
 #define HIDE(x)                                 \
     ({                                          \
diff --git a/xen/include/xen/self-tests.h b/xen/include/xen/self-tests.h
index 42a4cc4d17fe..4bc322b7f2a6 100644
--- a/xen/include/xen/self-tests.h
+++ b/xen/include/xen/self-tests.h
@@ -22,9 +22,9 @@
         typeof(fn(val)) real = fn(val);                                 \
                                                                         \
         if ( !__builtin_constant_p(real) )                              \
-            asm ( ".error \"'" STR(fn(val)) "' not compile-time constant\"" ); \
+            BUILD_ERROR("'" STR(fn(val)) "' not compile-time constant"); \
         else if ( real != res )                                         \
-            asm ( ".error \"Compile time check '" STR(fn(val) == res) "' failed\"" ); \
+            BUILD_ERROR("Compile time check '" STR(fn(val) == res) "' failed"); \
     } while ( 0 )
 #else
 #define COMPILE_CHECK(fn, val, res)

From patchwork Tue Jun 25 19:07:17 2024
X-Patchwork-Submitter: Andrew Cooper
X-Patchwork-Id: 13711851
From: Andrew Cooper
To: Xen-devel
Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Stefano Stabellini,
 Julien Grall, Volodymyr Babchuk, Bertrand Marquis, Michal Orzel,
 Oleksii Kurochko
Subject: [PATCH 4/6] xen/bitops: Introduce for_each_set_bit()
Date: Tue, 25 Jun 2024 20:07:17 +0100
Message-Id: <20240625190719.788643-5-andrew.cooper3@citrix.com>
In-Reply-To: <20240625190719.788643-1-andrew.cooper3@citrix.com>
References: <20240625190719.788643-1-andrew.cooper3@citrix.com>

The prior version (renamed to bitmap_for_each()) was inefficient when used
over a scalar, but this is the more common usage even before accounting for
the many opencoded forms.

Introduce a new version which operates on scalars only and does so without
spilling them to memory. This in turn requires the addition of a type-generic
form of ffs().

Add testing for the new construct alongside the ffs/fls testing.

Signed-off-by: Andrew Cooper
Reviewed-by: Jan Beulich
---
CC: Jan Beulich
CC: Roger Pau Monné
CC: Stefano Stabellini
CC: Julien Grall
CC: Volodymyr Babchuk
CC: Bertrand Marquis
CC: Michal Orzel
CC: Oleksii Kurochko

The naming of ffs_g() is taken from the new compiler builtins which are using
a g suffix to mean type-generic.
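For illustration only, the scalar iteration technique (find the lowest set
bit, then clear it with v &= v - 1) can be sketched in portable user-space C.
This is not the macro in the patch: ffs64_sketch() and collect_set_bits() are
hypothetical helpers, and a plain loop stands in for Xen's ffs64().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 1-based index of the lowest set bit, or 0 if none. */
static unsigned int ffs64_sketch(uint64_t x)
{
    unsigned int pos = 1;

    if ( !x )
        return 0;

    while ( !(x & 1) )
    {
        x >>= 1;
        ++pos;
    }

    return pos;
}

/*
 * Store the 0-based index of each set bit of val in out[], lowest first,
 * and return how many were stored.  The scalar stays in a local copy; no
 * in-memory bitmap is needed.
 */
static unsigned int collect_set_bits(uint64_t val, unsigned int out[64])
{
    unsigned int n = 0;

    /* v &= v - 1 clears the lowest set bit, so the loop runs once per bit. */
    for ( uint64_t v = val; v; v &= v - 1 )
        out[n++] = ffs64_sketch(v) - 1;

    assert(n <= 64);

    return n;
}
```

The loop body runs once per set bit rather than once per bit position, which
is the property the new for_each_set_bit() is after.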
--- xen/common/bitops.c | 29 +++++++++++++++++++++++++++++ xen/include/xen/bitops.h | 24 ++++++++++++++++++++++++ 2 files changed, 53 insertions(+) diff --git a/xen/common/bitops.c b/xen/common/bitops.c index 94a8983af99c..9e532f0d87aa 100644 --- a/xen/common/bitops.c +++ b/xen/common/bitops.c @@ -84,8 +84,37 @@ static void __init test_fls(void) CHECK(fls64, 0x8000000000000001ULL, 64); } +static void __init test_for_each_set_bit(void) +{ + unsigned int ui, ui_res = 0; + unsigned long ul, ul_res = 0; + uint64_t ull, ull_res = 0; + + ui = HIDE(0x80008001U); + for_each_set_bit ( i, ui ) + ui_res |= 1U << i; + + if ( ui != ui_res ) + panic("for_each_set_bit(uint) expected %#x, got %#x\n", ui, ui_res); + + ul = HIDE(1UL << (BITS_PER_LONG - 1) | 1); + for_each_set_bit ( i, ul ) + ul_res |= 1UL << i; + + if ( ul != ul_res ) + panic("for_each_set_bit(ulong) expected %#lx, got %#lx\n", ul, ul_res); + + ull = HIDE(0x8000000180000001ULL); + for_each_set_bit ( i, ull ) + ull_res |= 1ULL << i; + + if ( ull != ull_res ) + panic("for_each_set_bit(uint64) expected %#"PRIx64", got %#"PRIx64"\n", ull, ull_res); +} + static void __init __constructor test_bitops(void) { test_ffs(); test_fls(); + test_for_each_set_bit(); } diff --git a/xen/include/xen/bitops.h b/xen/include/xen/bitops.h index 24de0835b7ab..84ffcb8d57bc 100644 --- a/xen/include/xen/bitops.h +++ b/xen/include/xen/bitops.h @@ -56,6 +56,16 @@ static always_inline __pure unsigned int ffs64(uint64_t x) return !x || (uint32_t)x ? ffs(x) : ffs(x >> 32) + 32; } +/* + * A type-generic ffs() which picks the appropriate ffs{,l,64}() based on its + * argument. + */ +#define ffs_g(x) \ + (sizeof(x) <= sizeof(int) ? ffs(x) : \ + sizeof(x) <= sizeof(long) ? ffsl(x) : \ + sizeof(x) <= sizeof(uint64_t) ? 
ffs64(x) : \ + ({ BUILD_ERROR("ffs_g() Bad input type"); 0; })) + static always_inline __pure unsigned int fls(unsigned int x) { if ( __builtin_constant_p(x) ) @@ -92,6 +102,20 @@ static always_inline __pure unsigned int fls64(uint64_t x) } } +/* + * for_each_set_bit() - Iterate over all set bits in a scalar value. + * + * @iter An iterator name. Scoped within the loop only. + * @val A scalar value to iterate over. + * + * A copy of @val is taken internally. + */ +#define for_each_set_bit(iter, val) \ + for ( typeof(val) __v = (val); __v; __v = 0 ) \ + for ( unsigned int (iter); \ + __v && ((iter) = ffs_g(__v) - 1, true); \ + __v &= __v - 1 ) + /* --------------------- Please tidy below here --------------------- */ #ifndef find_next_bit From patchwork Tue Jun 25 19:07:18 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Cooper X-Patchwork-Id: 13711852 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E7DD3C3065D for ; Tue, 25 Jun 2024 19:07:45 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.748027.1155606 (Exim 4.92) (envelope-from ) id 1sMBVq-0000vw-DU; Tue, 25 Jun 2024 19:07:30 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 748027.1155606; Tue, 25 Jun 2024 19:07:30 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1sMBVq-0000vB-4c; Tue, 25 Jun 2024 19:07:30 +0000 Received: by outflank-mailman (input) for mailman id 748027; Tue, 25 Jun 2024 19:07:28 +0000 Received: from 
se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1sMBVo-00008w-O0 for xen-devel@lists.xenproject.org; Tue, 25 Jun 2024 19:07:28 +0000 Received: from mail-ed1-x52b.google.com (mail-ed1-x52b.google.com [2a00:1450:4864:20::52b]) by se1-gles-sth1.inumbo.com (Halon) with ESMTPS id 2a229ede-3326-11ef-90a3-e314d9c70b13; Tue, 25 Jun 2024 21:07:28 +0200 (CEST) Received: by mail-ed1-x52b.google.com with SMTP id 4fb4d7f45d1cf-57d07f07a27so6758519a12.3 for ; Tue, 25 Jun 2024 12:07:28 -0700 (PDT) Received: from andrewcoop.eng.citrite.net ([160.101.139.1]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-a725d7b190fsm180434766b.50.2024.06.25.12.07.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Jun 2024 12:07:26 -0700 (PDT) X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 2a229ede-3326-11ef-90a3-e314d9c70b13 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=citrix.com; s=google; t=1719342447; x=1719947247; darn=lists.xenproject.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Fy0Ve82OeeZoLYNkYstdewJQpe9Npq/P9dEBPwXzBhw=; b=GKx6sqQ7t+Feqy/jhMpBsGnVSZXS2svgsxl36T2F3tr/R9XUveQFDZFIYiYk6dHc/E 9u4TueOLdLnM3JCmVnsoZJXpU+ZNYrDDGDVn19dAZcujvcRgRHujQ7BTOExFFXVHc2Fz Y1dktfqhQAC5Zki8BxfMFYTM+F6WLW4qWfMNo= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1719342447; x=1719947247; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Fy0Ve82OeeZoLYNkYstdewJQpe9Npq/P9dEBPwXzBhw=; 
b=whrnoK4KC1HmmOdMQNICN0KzC3YUFUoP2noi+bl+FMt1+IDSFhbZlUGFMJXPrOJEoH IvndQaiQxYRz4HA6e/mMUD/l1tvgiMX8CESyVhOEAOUBCwXsTEkfMUEDRlCAlFx/Ecyv 0mdLG1EMiToo1jQ3bnP3vZ1eoEFNBQLrrZ/QdQxGEmdvULHlAVYIkF2K1+KwUCT917cz T35KfuUmJ6WErgzEzJRqWETKbW4Hvz/fTOENRPLZxRWPmyHpFN6xpHj82bVlJXmP1xvF 8OEy8zgTazpiNlyxuO/8JN7CAhFnyeEvSJz6bRc0n7JZzDe9O6QiDBPTRRh8JwwszYMn CsUw== X-Gm-Message-State: AOJu0Yyp+fXgGRERcPF3ICWSEk23rSYVFyg48z2tAajpEhGHFORT4YCK IXyFK/n2XioLnjUwz1wWL55YIJni0GigicvIcR1BHDlhmHP1gvhLGkOA/Vj1AYaDtOhsj80zBzB aQBU= X-Google-Smtp-Source: AGHT+IFtdpbsqg8qlXa1oBwhSkY9AFdg41EFbjwtjMAZJsCOUe08bxR94vYnzxzecRZqHGxMo1Bgew== X-Received: by 2002:a17:907:c78e:b0:a72:4b31:13b5 with SMTP id a640c23a62f3a-a727f855270mr120272966b.54.1719342447117; Tue, 25 Jun 2024 12:07:27 -0700 (PDT) From: Andrew Cooper To: Xen-devel Cc: Andrew Cooper , Stefano Stabellini , Julien Grall , Volodymyr Babchuk , Bertrand Marquis , Michal Orzel Subject: [PATCH 5/6] ARM/vgic: Use for_each_set_bit() in vgic_set_irqs_pending() Date: Tue, 25 Jun 2024 20:07:18 +0100 Message-Id: <20240625190719.788643-6-andrew.cooper3@citrix.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240625190719.788643-1-andrew.cooper3@citrix.com> References: <20240625190719.788643-1-andrew.cooper3@citrix.com> MIME-Version: 1.0 ... which is better optimised for scalar values, rather than using the arbitrary-sized bitmap helpers. For ARM32: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-16 (-16) Function old new delta vgic_set_irqs_pending 284 268 -16 including removing calls to _find_{first,next}_bit_le(), and two stack-spilled words too. For ARM64: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-40 (-40) Function old new delta vgic_set_irqs_pending 268 228 -40 including removing three calls to find_next_bit(). 
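To illustrate why the scalar form tends to generate less code, the two iteration styles can be contrasted in a small userspace sketch. Everything here is an assumption made for illustration: find_next_bit_sketch() is a hypothetical single-word simplification, not Xen's find_next_bit() or _find_next_bit_le(), and walk_as_bitmap() and walk_as_scalar() are made-up names. The bitmap style hands the value to a helper by address, so with an out-of-line helper the value has to be spilled to memory; the scalar style only ever touches a local copy.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical single-word bitmap search: index of the next set bit at or
 * after @offset, or @size if there is none.  Assumes @size does not exceed
 * the number of bits in an unsigned long.
 */
static unsigned int find_next_bit_sketch(const unsigned long *addr,
                                         unsigned int size,
                                         unsigned int offset)
{
    unsigned long word;

    if ( offset >= size )
        return size;

    word = *addr >> offset;           /* Discard bits already visited. */
    if ( !word )
        return size;

    offset += __builtin_ctzl(word);   /* Skip to the next set bit. */

    return offset < size ? offset : size;
}

/* Bitmap style: the mask is passed by address on every step. */
static uint32_t walk_as_bitmap(uint32_t r)
{
    const unsigned long mask = r;
    uint32_t seen = 0;

    for ( unsigned int i = find_next_bit_sketch(&mask, 32, 0);
          i < 32;
          i = find_next_bit_sketch(&mask, 32, i + 1) )
        seen |= UINT32_C(1) << i;

    return seen;
}

/* Scalar style: clear the lowest set bit of a local copy on every step. */
static uint32_t walk_as_scalar(uint32_t r)
{
    uint32_t seen = 0;

    for ( uint32_t v = r; v; v &= v - 1 )
        seen |= UINT32_C(1) << __builtin_ctz(v);

    return seen;
}

/* Both styles must visit exactly the same bits. */
static void check_same_bits(uint32_t r)
{
    assert(walk_as_bitmap(r) == r);
    assert(walk_as_scalar(r) == r);
}
```

Both walks visit the same bits in the same order; the difference is only in how much work each step does and whether the value needs an address.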
Signed-off-by: Andrew Cooper Reviewed-by: Michal Orzel --- CC: Stefano Stabellini CC: Julien Grall CC: Volodymyr Babchuk CC: Bertrand Marquis CC: Michal Orzel TODO: These were debug builds, and I need to redo the analysis with release builds. Also extend to the other examples. --- xen/arch/arm/vgic.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c index 57519e834d78..c060676aee78 100644 --- a/xen/arch/arm/vgic.c +++ b/xen/arch/arm/vgic.c @@ -421,15 +421,13 @@ void vgic_enable_irqs(struct vcpu *v, uint32_t r, unsigned int n) void vgic_set_irqs_pending(struct vcpu *v, uint32_t r, unsigned int rank) { - const unsigned long mask = r; - unsigned int i; /* The first rank is always per-vCPU */ bool private = rank == 0; /* LPIs will never be set pending via this function */ ASSERT(!is_lpi(32 * rank + 31)); - bitmap_for_each( i, &mask, 32 ) + for_each_set_bit ( i, r ) { unsigned int irq = i + 32 * rank; From patchwork Tue Jun 25 19:07:19 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Andrew Cooper X-Patchwork-Id: 13711850 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E729DC41513 for ; Tue, 25 Jun 2024 19:07:43 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.748029.1155629 (Exim 4.92) (envelope-from ) id 1sMBVs-0001be-5x; Tue, 25 Jun 2024 19:07:32 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 748029.1155629; Tue, 25 Jun 2024 19:07:32 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by 
lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1sMBVs-0001aR-1y; Tue, 25 Jun 2024 19:07:32 +0000 Received: by outflank-mailman (input) for mailman id 748029; Tue, 25 Jun 2024 19:07:29 +0000 Received: from se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1sMBVp-00008w-O3 for xen-devel@lists.xenproject.org; Tue, 25 Jun 2024 19:07:29 +0000 Received: from mail-ej1-x631.google.com (mail-ej1-x631.google.com [2a00:1450:4864:20::631]) by se1-gles-sth1.inumbo.com (Halon) with ESMTPS id 2aa255d4-3326-11ef-90a3-e314d9c70b13; Tue, 25 Jun 2024 21:07:29 +0200 (CEST) Received: by mail-ej1-x631.google.com with SMTP id a640c23a62f3a-a6fd513f18bso501003766b.3 for ; Tue, 25 Jun 2024 12:07:29 -0700 (PDT) Received: from andrewcoop.eng.citrite.net ([160.101.139.1]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-a725d7b190fsm180434766b.50.2024.06.25.12.07.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Jun 2024 12:07:27 -0700 (PDT) X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 2aa255d4-3326-11ef-90a3-e314d9c70b13 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=citrix.com; s=google; t=1719342448; x=1719947248; darn=lists.xenproject.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=IqTAjuFXDg4PchRKJS8PgwLD1p7wjSEO9wUcu/jl/lg=; b=aF7Eudbs/pvxagbLa98MU3eidR/QdZZTGQG2uELKl4t9gnJr/60LB8peKiHyLgMd7Y c3AF5sFwInypUhNvKwNcLI83w/7l9iijfJ29p+edAdxs4WGgmub0tduDm4WKEj/oPXNw xZ7xxmwxSiUKHZrKBwSub7JhfZ7UlXD40XQt0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1719342448; x=1719947248; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=IqTAjuFXDg4PchRKJS8PgwLD1p7wjSEO9wUcu/jl/lg=; b=sN0bwQ/D8zYAvvlxgFCLpa1GOdRiUDCAQiVjtDagixpIOiByGEN82nrvL28itQe1fz oULq0dnp7nx7sATmBvtHgbAH4jxyzj/vVb+CCih7t+egT2GW9Oi0YbMrp2E/qfnNDfud SYpQxt8gieONFxu4ugijtdNelkQOQv8CahozdYFyV0rIk3kOWPqHcKjR+e1oF2HTm5S7 nSz3d3aMVjs18khVXTYFJFgKziunRk7fog5Z7n+nEF6tiSTzuYd53Tkib9SINF6WGJYt lWt1H4m2kltdVqbGmRBAFns7uxEdWqTI+o1YEUugqXnX8njoBNG41ez4+0I17Uike6SV 4Xhw== X-Gm-Message-State: AOJu0YziNW1sXQ0aZlBqpMUxWN/Q8Dzb0zJtgKJzr8XOTTqy0d+844Ht Q0Zbg0owXuPzvsh30VIMT80Vwo/6CeJH+nRkeUyvhEaCmBWCrsb2+LGdcel1A76BhOS3dfJthxS MlKs= X-Google-Smtp-Source: AGHT+IEXoA2JRkMlTSVwVKWS+vj658ux8ClHQgfVG80Prd6QW4t1IN1zGIW7wD48aUpC2pYpxZkHdg== X-Received: by 2002:a17:906:1995:b0:a6e:fa0a:4899 with SMTP id a640c23a62f3a-a7245b45affmr505431466b.16.1719342448411; Tue, 25 Jun 2024 12:07:28 -0700 (PDT) From: Andrew Cooper To: Xen-devel Cc: Andrew Cooper , Jan Beulich , Roger Pau Monné , Stefano Stabellini , Julien Grall , Volodymyr Babchuk , Bertrand Marquis , Michal Orzel , Oleksii Kurochko Subject: [PATCH 6/6] x86/xstate: Switch back to for_each_set_bit() Date: Tue, 25 Jun 2024 20:07:19 +0100 Message-Id: <20240625190719.788643-7-andrew.cooper3@citrix.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240625190719.788643-1-andrew.cooper3@citrix.com> References: <20240625190719.788643-1-andrew.cooper3@citrix.com> MIME-Version: 1.0 In all 3 examples, we're iterating over a scalar. No caller can pass the COMPRESSED flag in, so the upper bound of 63, as opposed to 64, doesn't matter. 
This alone produces: add/remove: 0/0 grow/shrink: 0/4 up/down: 0/-161 (-161) Function old new delta compress_xsave_states 66 58 -8 xstate_uncompressed_size 119 71 -48 xstate_compressed_size 124 76 -48 recalculate_xstate 347 290 -57 where xstate_{un,}compressed_size() have practically halved in size despite being small before. The change in compress_xsave_states() is unexpected. The function is almost entirely dead code, and within what remains there's a smaller stack frame. I suspect it's leftovers that the optimiser couldn't fully discard. Signed-off-by: Andrew Cooper Reviewed-by: Jan Beulich --- CC: Jan Beulich CC: Roger Pau Monné CC: Stefano Stabellini CC: Julien Grall CC: Volodymyr Babchuk CC: Bertrand Marquis CC: Michal Orzel CC: Oleksii Kurochko --- xen/arch/x86/cpu-policy.c | 4 ++-- xen/arch/x86/xstate.c | 8 ++++---- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c index cd53bac777dc..fa55f6073089 100644 --- a/xen/arch/x86/cpu-policy.c +++ b/xen/arch/x86/cpu-policy.c @@ -193,7 +193,7 @@ static void sanitise_featureset(uint32_t *fs) static void recalculate_xstate(struct cpu_policy *p) { uint64_t xstates = XSTATE_FP_SSE; - unsigned int i, ecx_mask = 0, Da1 = p->xstate.Da1; + unsigned int ecx_mask = 0, Da1 = p->xstate.Da1; /* * The Da1 leaf is the only piece of information preserved in the common @@ -237,7 +237,7 @@ static void recalculate_xstate(struct cpu_policy *p) /* Subleafs 2+ */ xstates &= ~XSTATE_FP_SSE; BUILD_BUG_ON(ARRAY_SIZE(p->xstate.comp) < 63); - bitmap_for_each ( i, &xstates, 63 ) + for_each_set_bit ( i, xstates ) { /* * Pass through size (eax) and offset (ebx) directly. 
Visbility of diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c index da9053c0a262..88dbfbeafacd 100644 --- a/xen/arch/x86/xstate.c +++ b/xen/arch/x86/xstate.c @@ -589,7 +589,7 @@ static bool valid_xcr0(uint64_t xcr0) unsigned int xstate_uncompressed_size(uint64_t xcr0) { - unsigned int size = XSTATE_AREA_MIN_SIZE, i; + unsigned int size = XSTATE_AREA_MIN_SIZE; /* Non-XCR0 states don't exist in an uncompressed image. */ ASSERT((xcr0 & ~X86_XCR0_STATES) == 0); @@ -606,7 +606,7 @@ unsigned int xstate_uncompressed_size(uint64_t xcr0) * with respect their index. */ xcr0 &= ~(X86_XCR0_SSE | X86_XCR0_X87); - bitmap_for_each ( i, &xcr0, 63 ) + for_each_set_bit ( i, xcr0 ) { const struct xstate_component *c = &raw_cpu_policy.xstate.comp[i]; unsigned int s = c->offset + c->size; @@ -621,7 +621,7 @@ unsigned int xstate_uncompressed_size(uint64_t xcr0) unsigned int xstate_compressed_size(uint64_t xstates) { - unsigned int i, size = XSTATE_AREA_MIN_SIZE; + unsigned int size = XSTATE_AREA_MIN_SIZE; if ( xstates == 0 ) return 0; @@ -634,7 +634,7 @@ unsigned int xstate_compressed_size(uint64_t xstates) * componenets require aligning to 64 first. */ xstates &= ~(X86_XCR0_SSE | X86_XCR0_X87); - bitmap_for_each ( i, &xstates, 63 ) + for_each_set_bit ( i, xstates ) { const struct xstate_component *c = &raw_cpu_policy.xstate.comp[i];