From patchwork Thu Feb 29 23:21:41 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Samuel Holland
X-Patchwork-Id: 13577705
From: Samuel Holland
To: Palmer Dabbelt, linux-riscv@lists.infradead.org
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Alexandre Ghiti,
 Jisheng Zhang, Yunhui Cui, Samuel Holland
Subject: [PATCH v5 00/13] riscv: ASID-related and UP-related TLB flush
 enhancements
Date: Thu, 29 Feb 2024 15:21:41 -0800
Message-ID: <20240229232211.161961-1-samuel.holland@sifive.com>
X-Mailer: git-send-email 2.43.1

While reviewing Alexandre Ghiti's "riscv: tlb flush improvements" series[1], I
noticed that most TLB flush functions end up as a call to local_flush_tlb_all()
when SMP is disabled. This series resolves that, and also optimizes the scenario
where SMP is enabled but only one CPU is present or online.
Along the way, I realized that we should be using single-ASID flushes wherever
possible, so I implemented that as well.

Here are some numbers from D1 (with SMP disabled) which show the performance
impact:

v6.8-rc6 + riscv/fixes + riscv/for-next:

System Benchmarks Partial Index              BASELINE       RESULT    INDEX
Execl Throughput                                 43.0        223.5     52.0
File Copy 1024 bufsize 2000 maxblocks          3960.0      62563.4    158.0
File Copy 256 bufsize 500 maxblocks            1655.0      17869.2    108.0
File Copy 4096 bufsize 8000 maxblocks          5800.0     164915.1    284.3
Pipe Throughput                               12440.0     161368.5    129.7
Pipe-based Context Switching                   4000.0      22247.1     55.6
Process Creation                                126.0        546.9     43.4
Shell Scripts (1 concurrent)                     42.4        599.6    141.4
Shell Scripts (16 concurrent)                     ---         39.3      ---
Shell Scripts (8 concurrent)                      6.0         79.1    131.9
System Call Overhead                          15000.0     246019.0    164.0
                                                                   ========
System Benchmarks Index Score (Partial Only)                          109.2

v6.8-rc6 + riscv/fixes + riscv/for-next + this patch series:

System Benchmarks Partial Index              BASELINE       RESULT    INDEX
Execl Throughput                                 43.0        223.1     51.9
File Copy 1024 bufsize 2000 maxblocks          3960.0      71982.9    181.8
File Copy 256 bufsize 500 maxblocks            1655.0      18436.9    111.4
File Copy 4096 bufsize 8000 maxblocks          5800.0     184955.2    318.9
Pipe Throughput                               12440.0     162622.9    130.7
Pipe-based Context Switching                   4000.0      22082.5     55.2
Process Creation                                126.0        546.4     43.4
Shell Scripts (1 concurrent)                     42.4        598.1    141.1
Shell Scripts (16 concurrent)                     ---         38.8      ---
Shell Scripts (8 concurrent)                      6.0         78.6    131.0
System Call Overhead                          15000.0     258529.3    172.4
                                                                   ========
System Benchmarks Index Score (Partial Only)                          112.8

[1]: https://lore.kernel.org/linux-riscv/20231030133027.19542-1-alexghiti@rivosinc.com/

Changes in v5:
 - Rebase on v6.8-rc1 + riscv/for-next (for the fast GUP implementation)
 - Add patch for minor refactoring in asm/pgalloc.h
 - Also switch to riscv_use_sbi_for_rfence() in asm/pgalloc.h
 - Leave use_asid_allocator declared in asm/mmu_context.h

Changes in v4:
 - Fix a possible race between flush_icache_*() and SMP bringup
 - Refactor riscv_use_ipi_for_rfence() to make later changes cleaner
 - Optimize kernel TLB flushes with only one CPU online
 - Optimize global cache/TLB flushes with only one CPU online
 - Merge the two copies of __flush_tlb_range() and rely on the compiler
   to optimize out the broadcast path (both clang and gcc do this)
 - Merge the two copies of flush_tlb_all() and rely on constant folding
 - Only set tlb_flush_all_threshold when CONFIG_MMU=y.

Changes in v3:
 - Fixed a performance regression caused by executing sfence.vma in a
   loop on implementations affected by SiFive CIP-1200
 - Rebased on v6.7-rc1

Changes in v2:
 - Move the SMP/UP merge earlier in the series to avoid build issues
 - Make a copy of __flush_tlb_range() instead of adding ifdefs inside
 - local_flush_tlb_all() is the only function used on !MMU (smpboot.c)

Samuel Holland (13):
  riscv: Flush the instruction cache during SMP bringup
  riscv: Factor out page table TLB synchronization
  riscv: Use IPIs for remote cache/TLB flushes by default
  riscv: mm: Broadcast kernel TLB flushes only when needed
  riscv: Only send remote fences when some other CPU is online
  riscv: mm: Combine the SMP and UP TLB flush code
  riscv: Apply SiFive CIP-1200 workaround to single-ASID sfence.vma
  riscv: Avoid TLB flush loops when affected by SiFive CIP-1200
  riscv: mm: Introduce cntx2asid/cntx2version helper macros
  riscv: mm: Use a fixed layout for the MM context ID
  riscv: mm: Make asid_bits a local variable
  riscv: mm: Preserve global TLB entries when switching contexts
  riscv: mm: Always use an ASID to flush mm contexts

 arch/riscv/Kconfig                   |  2 +-
 arch/riscv/errata/sifive/errata.c    |  5 ++
 arch/riscv/include/asm/errata_list.h | 12 ++++-
 arch/riscv/include/asm/mmu.h         |  3 ++
 arch/riscv/include/asm/pgalloc.h     | 32 ++++++------
 arch/riscv/include/asm/sbi.h         |  4 ++
 arch/riscv/include/asm/smp.h         | 15 +-----
 arch/riscv/include/asm/tlbflush.h    | 51 ++++++++-----------
 arch/riscv/kernel/sbi-ipi.c          | 11 +++-
 arch/riscv/kernel/smp.c              | 11 +---
 arch/riscv/kernel/smpboot.c          |  7 +--
 arch/riscv/mm/Makefile               |  5 +-
 arch/riscv/mm/cacheflush.c           |  7 +--
 arch/riscv/mm/context.c              | 23 ++++-----
 arch/riscv/mm/tlbflush.c             | 75 ++++++++--------------------
 drivers/clocksource/timer-clint.c    |  2 +-
 16 files changed, 114 insertions(+), 151 deletions(-)
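
For readers who want the relative change rather than the raw scores, the INDEX
columns of the two benchmark tables above can be compared directly. The
following is only an illustrative userspace sketch and not part of the series;
the names bench_row, rows, pct_change() and print_changes() are made up for
this example, and the "Shell Scripts (16 concurrent)" row is left out because
it has no INDEX value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One row of the "Partial Index" tables (hypothetical helper type). */
struct bench_row {
	const char *name;
	double before;	/* INDEX on the baseline tree */
	double after;	/* INDEX with the series applied */
};

/* INDEX values copied from the two tables in the cover letter. */
const struct bench_row rows[] = {
	{ "Execl Throughput",                       52.0,  51.9 },
	{ "File Copy 1024 bufsize 2000 maxblocks", 158.0, 181.8 },
	{ "File Copy 256 bufsize 500 maxblocks",   108.0, 111.4 },
	{ "File Copy 4096 bufsize 8000 maxblocks", 284.3, 318.9 },
	{ "Pipe Throughput",                       129.7, 130.7 },
	{ "Pipe-based Context Switching",           55.6,  55.2 },
	{ "Process Creation",                       43.4,  43.4 },
	{ "Shell Scripts (1 concurrent)",          141.4, 141.1 },
	{ "Shell Scripts (8 concurrent)",          131.9, 131.0 },
	{ "System Call Overhead",                  164.0, 172.4 },
	{ "System Benchmarks Index Score",         109.2, 112.8 },
};
const size_t nrows = sizeof(rows) / sizeof(rows[0]);

/* Relative change in percent; positive means a higher score. */
double pct_change(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}

/* Print one line per benchmark with its relative change. */
void print_changes(void)
{
	for (size_t i = 0; i < nrows; i++)
		printf("%-40s %+6.1f%%\n", rows[i].name,
		       pct_change(rows[i].before, rows[i].after));
}
```

With these inputs the overall index score rises by roughly 3.3% (109.2 to
112.8). The largest gain is File Copy 1024 at about 15%, while several other
rows move by less than 1% in either direction.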