From patchwork Thu Nov 7 20:20:31 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 13867066
Date: Thu, 7 Nov 2024 13:20:31 -0700
In-Reply-To: <20241107202033.2721681-1-yuzhao@google.com>
References: <20241107202033.2721681-1-yuzhao@google.com>
X-Mailer: git-send-email 2.47.0.277.g8800431eea-goog
Message-ID: <20241107202033.2721681-5-yuzhao@google.com>
Subject: [PATCH v2 4/6] arm64: broadcast IPIs to pause remote CPUs
From: Yu Zhao
To: Andrew Morton, Catalin Marinas, Marc Zyngier, Muchun Song,
 Thomas Gleixner, Will Deacon
Cc: Douglas Anderson, Mark Rutland, Nanyong Sun,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Yu Zhao

Broadcast pseudo-NMI IPIs to pause remote CPUs for a short period of
time, and then reliably resume them when the local CPU exits critical
sections that preclude the execution of remote CPUs.

A typical example of such critical sections is BBM on kernel PTEs.
HugeTLB Vmemmap Optimization (HVO) on arm64 was disabled by
commit 060a2c92d1b6 ("arm64: mm: hugetlb: Disable
HUGETLB_PAGE_OPTIMIZE_VMEMMAP") due to the following reason:

  This is deemed UNPREDICTABLE by the Arm architecture without a
  break-before-make sequence (make the PTE invalid, TLBI, write the
  new valid PTE). However, such sequence is not possible since the
  vmemmap may be concurrently accessed by the kernel.

Supporting BBM on kernel PTEs is one of the approaches that can make
HVO safe on arm64.

Signed-off-by: Yu Zhao
---
 arch/arm64/include/asm/smp.h |  3 ++
 arch/arm64/kernel/smp.c      | 85 +++++++++++++++++++++++++++++++++---
 2 files changed, 81 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index 2510eec026f7..cffb0cfed961 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -133,6 +133,9 @@ bool cpus_are_stuck_in_kernel(void);
 extern void crash_smp_send_stop(void);
 extern bool smp_crash_stop_failed(void);
 
+void pause_remote_cpus(void);
+void resume_remote_cpus(void);
+
 #endif /* ifndef __ASSEMBLY__ */
 
 #endif /* ifndef __ASM_SMP_H */
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 3b3f6b56e733..54e9f6374aa3 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -85,7 +85,12 @@ static int ipi_irq_base __ro_after_init;
 static int nr_ipi __ro_after_init = NR_IPI;
 static struct irq_desc *ipi_desc[MAX_IPI] __ro_after_init;
 
-static bool crash_stop;
+enum {
+	SEND_STOP,
+	CRASH_STOP,
+};
+
+static unsigned long stop_in_progress;
 
 static void ipi_setup(int cpu);
 
@@ -917,6 +922,72 @@ static void __noreturn ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs
 #endif
 }
 
+static DEFINE_RAW_SPINLOCK(cpu_pause_lock);
+static bool __cacheline_aligned_in_smp cpu_paused;
+static atomic_t __cacheline_aligned_in_smp nr_cpus_paused;
+
+static void pause_local_cpu(void)
+{
+	atomic_inc(&nr_cpus_paused);
+
+	while (READ_ONCE(cpu_paused))
+		cpu_relax();
+
+	atomic_dec(&nr_cpus_paused);
+
+	/*
+	 * The caller of resume_remote_cpus() should make sure that clearing
+	 * cpu_paused is ordered after other changes that can have any impact on
+	 * this CPU. The isb() below makes sure this CPU doesn't speculatively
+	 * execute the next instruction before it sees all those changes.
+	 */
+	isb();
+}
+
+void pause_remote_cpus(void)
+{
+	cpumask_t cpus_to_pause;
+	int nr_cpus_to_pause = num_online_cpus() - 1;
+
+	lockdep_assert_cpus_held();
+	lockdep_assert_preemption_disabled();
+
+	if (!nr_cpus_to_pause)
+		return;
+
+	cpumask_copy(&cpus_to_pause, cpu_online_mask);
+	cpumask_clear_cpu(smp_processor_id(), &cpus_to_pause);
+
+	raw_spin_lock(&cpu_pause_lock);
+
+	WARN_ON_ONCE(cpu_paused);
+	WARN_ON_ONCE(atomic_read(&nr_cpus_paused));
+
+	cpu_paused = true;
+
+	smp_cross_call(&cpus_to_pause, IPI_CPU_STOP_NMI);
+
+	while (atomic_read(&nr_cpus_paused) != nr_cpus_to_pause)
+		cpu_relax();
+
+	raw_spin_unlock(&cpu_pause_lock);
+}
+
+void resume_remote_cpus(void)
+{
+	if (!cpu_paused)
+		return;
+
+	raw_spin_lock(&cpu_pause_lock);
+
+	WRITE_ONCE(cpu_paused, false);
+
+	while (atomic_read(&nr_cpus_paused))
+		cpu_relax();
+
+	raw_spin_unlock(&cpu_pause_lock);
+}
+
 static void arm64_backtrace_ipi(cpumask_t *mask)
 {
 	__ipi_send_mask(ipi_desc[IPI_CPU_BACKTRACE], mask);
@@ -970,7 +1041,9 @@ static void do_handle_IPI(int ipinr)
 
 	case IPI_CPU_STOP:
 	case IPI_CPU_STOP_NMI:
-		if (IS_ENABLED(CONFIG_KEXEC_CORE) && crash_stop) {
+		if (!test_bit(SEND_STOP, &stop_in_progress)) {
+			pause_local_cpu();
+		} else if (test_bit(CRASH_STOP, &stop_in_progress)) {
 			ipi_cpu_crash_stop(cpu, get_irq_regs());
 			unreachable();
 		} else {
@@ -1142,7 +1215,6 @@ static inline unsigned int num_other_online_cpus(void)
 
 void smp_send_stop(void)
 {
-	static unsigned long stop_in_progress;
 	cpumask_t mask;
 	unsigned long timeout;
 
@@ -1154,7 +1226,7 @@ void smp_send_stop(void)
 		goto skip_ipi;
 
 	/* Only proceed if this is the first CPU to reach this code */
-	if (test_and_set_bit(0, &stop_in_progress))
+	if (test_and_set_bit(SEND_STOP, &stop_in_progress))
 		return;
 
 	/*
@@ -1230,12 +1302,11 @@ void crash_smp_send_stop(void)
 	 * This function can be called twice in panic path, but obviously
 	 * we execute this only once.
 	 *
-	 * We use this same boolean to tell whether the IPI we send was a
+	 * We use the CRASH_STOP bit to tell whether the IPI we send was a
 	 * stop or a "crash stop".
 	 */
-	if (crash_stop)
+	if (test_and_set_bit(CRASH_STOP, &stop_in_progress))
 		return;
-	crash_stop = 1;
 
 	smp_send_stop();
 
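
Not part of the patch: for anyone who wants to poke at the pause/resume
handshake outside the kernel, below is a rough userspace sketch. It only
mirrors the counting protocol (check in, spin on a flag, check out) and
rests on several assumptions: pthreads stand in for CPUs, polling a flag
stands in for the pseudo-NMI IPI, C11 seq_cst atomics stand in for the
kernel primitives, and there is no equivalent of cpu_pause_lock or isb().
All names in it (pause_local, pause_workers, resume_workers, run_demo,
NR_WORKERS, NR_ROUNDS) are made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_WORKERS 3
#define NR_ROUNDS  1000

static atomic_bool paused;    /* plays the role of cpu_paused */
static atomic_int nr_paused;  /* plays the role of nr_cpus_paused */
static atomic_bool stop;      /* tells the workers to exit */
static atomic_int violations; /* times a worker saw a != b */
static int a, b;              /* invariant outside the critical section: a == b */

/* Analogue of pause_local_cpu(): check in, spin until released, check out. */
static void pause_local(void)
{
	atomic_fetch_add(&nr_paused, 1);
	while (atomic_load(&paused))
		sched_yield();
	atomic_fetch_sub(&nr_paused, 1);
}

static void *worker(void *arg)
{
	(void)arg;
	while (!atomic_load(&stop)) {
		if (atomic_load(&paused)) /* polling replaces the IPI */
			pause_local();
		if (a != b)
			atomic_fetch_add(&violations, 1);
		sched_yield();
	}
	return NULL;
}

/* Analogue of pause_remote_cpus(): return only once every worker is parked. */
static void pause_workers(void)
{
	atomic_store(&paused, true);
	while (atomic_load(&nr_paused) != NR_WORKERS)
		sched_yield();
}

/* Analogue of resume_remote_cpus(): return only once every worker has left. */
static void resume_workers(void)
{
	atomic_store(&paused, false);
	while (atomic_load(&nr_paused) != 0)
		sched_yield();
}

/* Returns the number of invariant violations seen, or -1 on setup failure. */
int run_demo(void)
{
	pthread_t tids[NR_WORKERS];
	int started = 0, i;

	for (; started < NR_WORKERS; started++)
		if (pthread_create(&tids[started], NULL, worker, NULL))
			break;

	if (started == NR_WORKERS) {
		for (i = 0; i < NR_ROUNDS; i++) {
			pause_workers();
			a++; /* invariant is broken here, but nobody is looking */
			b++;
			resume_workers();
		}
	}

	atomic_store(&stop, true);
	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);

	return started == NR_WORKERS ? atomic_load(&violations) : -1;
}
```

Calling run_demo() from a main() and building with -pthread should be
enough to try it. The wait in resume_workers() is what makes back-to-back
rounds safe: without it, a worker still spinning from the previous round
could be counted again in the next one.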