From patchwork Sun Nov 28 03:56:55 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yury Norov
X-Patchwork-Id: 12642657
From: Yury Norov
To: linux-kernel@vger.kernel.org, Yury Norov, "James E.J. Bottomley",
	"Martin K. Petersen", "Paul E. McKenney", "Rafael J. Wysocki",
	Alexander Shishkin, Alexey Klimov, Amitkumar Karwar, Andi Kleen,
	Andrew Lunn, Andrew Morton, Andy Gross, Andy Lutomirski,
	Andy Shevchenko, Anup Patel, Ard Biesheuvel,
	Arnaldo Carvalho de Melo, Arnd Bergmann, Borislav Petkov,
	Catalin Marinas, Christoph Hellwig, Christoph Lameter,
	Daniel Vetter, Dave Hansen, David Airlie, David Laight,
	Dennis Zhou, Dinh Nguyen, Geetha sowjanya, Geert Uytterhoeven,
	Greg Kroah-Hartman, Guo Ren, Hans de Goede, Heiko Carstens,
	Ian Rogers, Ingo Molnar, Jakub Kicinski, Jason Wessel, Jens Axboe,
	Jiri Olsa, Jonathan Cameron, Juri Lelli, Kalle Valo, Kees Cook,
	Krzysztof Kozlowski, Lee Jones, Marc Zyngier, Marcin Wojtas,
	Mark Gross, Mark Rutland, Matti Vaittinen, Mauro Carvalho Chehab,
	Mel Gorman, Michael Ellerman, Mike Marciniszyn, Nicholas Piggin,
	Palmer Dabbelt, Peter Zijlstra, Petr Mladek, Randy Dunlap,
	Rasmus Villemoes, Roy Pledge, Russell King, Saeed Mahameed,
	Sagi Grimberg, Sergey Senozhatsky, Solomon Peachy, Stephen Boyd,
	Stephen Rothwell, Steven Rostedt, Subbaraya Sundeep, Sudeep Holla,
	Sunil Goutham, Tariq Toukan, Tejun Heo, Thomas Bogendoerfer,
	Thomas Gleixner, Ulf Hansson, Vincent Guittot, Vineet Gupta,
	Viresh Kumar, Vivien Didelot, Vlastimil Babka, Will Deacon,
	bcm-kernel-feedback-list@broadcom.com, kvm@vger.kernel.org,
	linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-crypto@vger.kernel.org, linux-csky@vger.kernel.org,
	linux-ia64@vger.kernel.org, linux-mips@vger.kernel.org,
	linux-mm@kvack.org, linux-perf-users@vger.kernel.org,
	linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org,
	linux-snps-arc@lists.infradead.org, linuxppc-dev@lists.ozlabs.org
Subject: [PATCH 0/9] lib/bitmap: optimize bitmap_weight() usage
Date: Sat, 27 Nov 2021 19:56:55 -0800
Message-Id: <20211128035704.270739-1-yury.norov@gmail.com>
X-Mailer: git-send-email 2.25.1

In many cases people use bitmap_weight()-based functions like this:

	if (num_present_cpus() > 1)
		do_something();

This may take a considerable amount of time on many-CPU machines because
num_present_cpus() traverses every word of the underlying cpumask
unconditionally.

We can significantly improve on this in many real cases if we stop
traversing the mask as soon as the count of present CPUs exceeds 1:

	if (num_present_cpus_gt(1))
		do_something();

To implement this idea, the series adds bitmap_weight_{eq,gt,le}
functions together with corresponding wrappers in cpumask and nodemask.
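The early-exit idea described above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the code added by the series: the names sketch_weight() and sketch_weight_gt() are hypothetical, the bitmap is assumed to consist of whole unsigned long words (no partial last word), and __builtin_popcountl() is a GCC/Clang builtin used here as a stand-in for the kernel's per-word bit counting.

```c
#include <stdbool.h>
#include <stddef.h>

/* Full count: always visits every word (models the bitmap_weight() pattern). */
static unsigned int sketch_weight(const unsigned long *map, size_t nwords)
{
	unsigned int w = 0;

	for (size_t i = 0; i < nwords; i++)
		w += (unsigned int)__builtin_popcountl(map[i]);
	return w;
}

/* Early exit: returns true as soon as more than num set bits have been seen. */
static bool sketch_weight_gt(const unsigned long *map, size_t nwords,
			     unsigned int num)
{
	unsigned int w = 0;

	for (size_t i = 0; i < nwords; i++) {
		w += (unsigned int)__builtin_popcountl(map[i]);
		if (w > num)
			return true;	/* remaining words are never read */
	}
	return false;
}
```

For a mask whose first word already has two bits set, sketch_weight_gt(map, nwords, 1) returns after reading a single word, while sketch_weight(map, nwords) > 1 reads all nwords words before the comparison is made.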

Yury Norov (9):
  lib/bitmap: add bitmap_weight_{eq,gt,le}
  lib/bitmap: implement bitmap_{empty,full} with bitmap_weight_eq()
  all: replace bitmap_weight() with bitmap_{empty,full,eq,gt,le}
  tools: sync bitmap_weight() usage with the kernel
  lib/cpumask: add cpumask_weight_{eq,gt,le}
  lib/nodemask: add nodemask_weight_{eq,gt,le}
  lib/cpumask: add num_{possible,present,active}_cpus_{eq,gt,le}
  lib/nodemask: add num_node_state_eq()
  MAINTAINERS: add cpumask and nodemask files to BITMAP_API

 MAINTAINERS                                   |  4 ++
 arch/alpha/kernel/process.c                   |  2 +-
 arch/arc/kernel/smp.c                         |  2 +-
 arch/arm/kernel/machine_kexec.c               |  2 +-
 arch/arm/mach-exynos/exynos.c                 |  2 +-
 arch/arm/mm/cache-b15-rac.c                   |  2 +-
 arch/arm64/kernel/smp.c                       |  2 +-
 arch/arm64/mm/context.c                       |  2 +-
 arch/csky/mm/asid.c                           |  2 +-
 arch/csky/mm/context.c                        |  2 +-
 arch/ia64/kernel/setup.c                      |  2 +-
 arch/ia64/mm/tlb.c                            |  8 +--
 arch/mips/cavium-octeon/octeon-irq.c          |  4 +-
 arch/mips/kernel/crash.c                      |  2 +-
 arch/mips/kernel/i8253.c                      |  2 +-
 arch/mips/kernel/perf_event_mipsxx.c          |  4 +-
 arch/mips/kernel/rtlx-cmp.c                   |  2 +-
 arch/mips/kernel/smp.c                        |  4 +-
 arch/mips/kernel/vpe-cmp.c                    |  2 +-
 .../loongson2ef/common/cs5536/cs5536_mfgpt.c  |  2 +-
 arch/mips/mm/context.c                        |  2 +-
 arch/mips/mm/tlbex.c                          |  2 +-
 arch/nds32/kernel/perf_event_cpu.c            |  4 +-
 arch/nios2/kernel/cpuinfo.c                   |  2 +-
 arch/powerpc/kernel/smp.c                     |  2 +-
 arch/powerpc/kernel/watchdog.c                |  4 +-
 arch/powerpc/platforms/85xx/smp.c             |  2 +-
 arch/powerpc/platforms/pseries/hotplug-cpu.c  |  4 +-
 arch/powerpc/sysdev/mpic.c                    |  2 +-
 arch/powerpc/xmon/xmon.c                      | 10 +--
 arch/riscv/kvm/vmid.c                         |  2 +-
 arch/s390/kernel/perf_cpum_cf.c               |  2 +-
 arch/sparc/kernel/mdesc.c                     |  6 +-
 arch/x86/events/amd/core.c                    |  2 +-
 arch/x86/kernel/alternative.c                 |  8 +--
 arch/x86/kernel/apic/apic.c                   |  4 +-
 arch/x86/kernel/apic/apic_flat_64.c           |  2 +-
 arch/x86/kernel/apic/probe_32.c               |  2 +-
 arch/x86/kernel/cpu/mce/dev-mcelog.c          |  2 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c        | 18 +++---
 arch/x86/kernel/hpet.c                        |  2 +-
 arch/x86/kernel/i8253.c                       |  2 +-
 arch/x86/kernel/kvm.c                         |  2 +-
 arch/x86/kernel/kvmclock.c                    |  2 +-
 arch/x86/kernel/smpboot.c                     |  4 +-
 arch/x86/kernel/tsc.c                         |  2 +-
 arch/x86/kvm/hyperv.c                         |  8 +--
 arch/x86/mm/amdtopology.c                     |  2 +-
 arch/x86/mm/mmio-mod.c                        |  2 +-
 arch/x86/mm/numa_emulation.c                  |  4 +-
 arch/x86/platform/uv/uv_nmi.c                 |  2 +-
 arch/x86/xen/smp_pv.c                         |  2 +-
 arch/x86/xen/spinlock.c                       |  2 +-
 drivers/acpi/numa/srat.c                      |  2 +-
 drivers/clk/samsung/clk-exynos4.c             |  2 +-
 drivers/clocksource/ingenic-timer.c           |  3 +-
 drivers/cpufreq/pcc-cpufreq.c                 |  2 +-
 drivers/cpufreq/qcom-cpufreq-hw.c             |  2 +-
 drivers/cpufreq/scmi-cpufreq.c                |  2 +-
 drivers/crypto/ccp/ccp-dev-v5.c               |  5 +-
 drivers/dma/mv_xor.c                          |  5 +-
 drivers/firmware/psci/psci_checker.c          |  2 +-
 drivers/gpu/drm/i810/i810_drv.c               |  2 +-
 drivers/gpu/drm/i915/i915_pmu.c               |  2 +-
 drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c      |  2 +-
 drivers/hv/channel_mgmt.c                     |  4 +-
 drivers/iio/adc/mxs-lradc-adc.c               |  3 +-
 drivers/iio/dummy/iio_simple_dummy_buffer.c   |  4 +-
 drivers/iio/industrialio-buffer.c             |  2 +-
 drivers/iio/industrialio-trigger.c            |  2 +-
 drivers/infiniband/hw/hfi1/affinity.c         | 13 ++--
 drivers/infiniband/hw/qib/qib_file_ops.c      |  2 +-
 drivers/infiniband/hw/qib/qib_iba7322.c       |  2 +-
 drivers/infiniband/sw/siw/siw_main.c          |  3 +-
 drivers/irqchip/irq-bcm6345-l1.c              |  2 +-
 drivers/irqchip/irq-gic.c                     |  2 +-
 drivers/memstick/core/ms_block.c              |  4 +-
 drivers/net/caif/caif_virtio.c                |  2 +-
 drivers/net/dsa/b53/b53_common.c              |  2 +-
 drivers/net/ethernet/broadcom/bcmsysport.c    |  6 +-
 .../cavium/liquidio/cn23xx_vf_device.c        |  2 +-
 drivers/net/ethernet/hisilicon/hns/hns_enet.c |  2 +-
 .../net/ethernet/intel/ice/ice_virtchnl_pf.c  |  4 +-
 .../net/ethernet/intel/ixgbe/ixgbe_sriov.c    |  2 +-
 .../net/ethernet/marvell/mvpp2/mvpp2_main.c   |  2 +-
 .../marvell/octeontx2/nic/otx2_ethtool.c      |  2 +-
 .../marvell/octeontx2/nic/otx2_flows.c        |  8 +--
 .../ethernet/marvell/octeontx2/nic/otx2_pf.c  |  2 +-
 drivers/net/ethernet/mellanox/mlx4/cmd.c      | 10 +--
 drivers/net/ethernet/mellanox/mlx4/eq.c       |  4 +-
 drivers/net/ethernet/mellanox/mlx4/main.c     |  2 +-
 .../ethernet/mellanox/mlx5/core/en_ethtool.c  |  2 +-
 drivers/net/ethernet/qlogic/qed/qed_dev.c     |  3 +-
 drivers/net/ethernet/qlogic/qed/qed_rdma.c    |  4 +-
 drivers/net/ethernet/qlogic/qed/qed_roce.c    |  2 +-
 drivers/net/wireless/ath/ath9k/hw.c           |  2 +-
 drivers/net/wireless/marvell/mwifiex/main.c   |  4 +-
 drivers/net/wireless/st/cw1200/queue.c        |  3 +-
 drivers/nvdimm/region.c                       |  2 +-
 drivers/nvme/host/pci.c                       |  2 +-
 drivers/perf/arm-cci.c                        |  2 +-
 drivers/perf/arm_pmu.c                        |  6 +-
 drivers/perf/hisilicon/hisi_uncore_pmu.c      |  2 +-
 drivers/perf/thunderx2_pmu.c                  |  3 +-
 drivers/perf/xgene_pmu.c                      |  2 +-
 .../intel/speed_select_if/isst_if_common.c    |  6 +-
 drivers/pwm/pwm-pca9685.c                     |  2 +-
 drivers/scsi/lpfc/lpfc_init.c                 |  2 +-
 drivers/soc/bcm/brcmstb/biuctrl.c             |  2 +-
 drivers/soc/fsl/dpio/dpio-service.c           |  4 +-
 drivers/soc/fsl/qbman/qman_test_stash.c       |  2 +-
 drivers/spi/spi-dw-bt1.c                      |  2 +-
 drivers/staging/media/tegra-video/vi.c        |  2 +-
 drivers/thermal/intel/intel_powerclamp.c      | 10 ++-
 drivers/virt/acrn/hsm.c                       |  2 +-
 fs/ocfs2/cluster/heartbeat.c                  | 14 ++---
 fs/xfs/xfs_sysfs.c                            |  2 +-
 include/linux/bitmap.h                        | 45 ++++++++++---
 include/linux/cpumask.h                       | 55 ++++++++++++++++
 include/linux/kdb.h                           |  2 +-
 include/linux/nodemask.h                      | 29 +++++++++
 kernel/debug/kdb/kdb_bt.c                     |  2 +-
 kernel/irq/affinity.c                         |  2 +-
 kernel/padata.c                               |  2 +-
 kernel/printk/printk.c                        |  2 +-
 kernel/rcu/tree_nocb.h                        |  4 +-
 kernel/rcu/tree_plugin.h                      |  2 +-
 kernel/reboot.c                               |  4 +-
 kernel/sched/core.c                           | 10 +--
 kernel/sched/topology.c                       |  4 +-
 kernel/time/clockevents.c                     |  4 +-
 kernel/time/clocksource.c                     |  2 +-
 lib/bitmap.c                                  | 63 +++++++++++++++++++
 mm/mempolicy.c                                |  2 +-
 mm/page_alloc.c                               |  2 +-
 mm/percpu.c                                   |  6 +-
 mm/slab.c                                     |  2 +-
 mm/vmstat.c                                   |  4 +-
 tools/include/linux/bitmap.h                  | 42 ++++++++++---
 tools/lib/bitmap.c                            | 60 ++++++++++++++++++
 tools/perf/builtin-c2c.c                      |  4 +-
 tools/perf/util/pmu.c                         |  2 +-
 142 files changed, 490 insertions(+), 251 deletions(-)