From patchwork Sat Sep 11 02:29:01 2021
X-Patchwork-Submitter: "Christopher M. Riedl"
X-Patchwork-Id: 12537663
From: "Christopher M. Riedl"
To: linuxppc-dev@lists.ozlabs.org
Cc: linux-hardening@vger.kernel.org
Subject: [PATCH v6 1/4] powerpc/64s: Introduce temporary mm for Radix MMU
Date: Fri, 10 Sep 2021 21:29:01 -0500
Message-Id: <20210911022904.30962-2-cmr@bluescreens.de>
In-Reply-To: <20210911022904.30962-1-cmr@bluescreens.de>

x86 supports the notion of a temporary mm which restricts access to
temporary PTEs to a single CPU. A temporary mm is useful for situations
where a CPU needs to perform sensitive operations (such as patching a
STRICT_KERNEL_RWX kernel) requiring temporary mappings without exposing
said mappings to other CPUs. Another benefit is that other CPU TLBs do
not need to be flushed when the temporary mm is torn down.

Mappings in the temporary mm can be set in the userspace portion of the
address-space.

Interrupts must be disabled while the temporary mm is in use. HW
breakpoints, which may have been set by userspace as watchpoints on
addresses now within the temporary mm, are saved and disabled when
loading the temporary mm. The HW breakpoints are restored when unloading
the temporary mm. All HW breakpoints are indiscriminately disabled while
the temporary mm is in use - this may include breakpoints set by perf.

Based on x86 implementation:

commit cefa929c034e ("x86/mm: Introduce temporary mm structs")

Signed-off-by: Christopher M. Riedl
---

v6:  * Use {start,stop}_using_temporary_mm() instead of
       {use,unuse}_temporary_mm() as suggested by Christophe.
v5:  * Drop support for using a temporary mm on Book3s64 Hash MMU.

v4:  * Pass the prev mm instead of NULL to switch_mm_irqs_off() when
       using/unusing the temp mm as suggested by Jann Horn to keep the
       context.active counter in-sync on mm/nohash.
     * Disable SLB preload in the temporary mm when initializing the
       temp_mm struct.
     * Include asm/debug.h header to fix build issue with
       ppc44x_defconfig.
---
 arch/powerpc/include/asm/debug.h |  1 +
 arch/powerpc/kernel/process.c    |  5 +++
 arch/powerpc/lib/code-patching.c | 56 ++++++++++++++++++++++++++++++++
 3 files changed, 62 insertions(+)

diff --git a/arch/powerpc/include/asm/debug.h b/arch/powerpc/include/asm/debug.h
index 86a14736c76c..dfd82635ea8b 100644
--- a/arch/powerpc/include/asm/debug.h
+++ b/arch/powerpc/include/asm/debug.h
@@ -46,6 +46,7 @@ static inline int debugger_fault_handler(struct pt_regs *regs) { return 0; }
 #endif
 
 void __set_breakpoint(int nr, struct arch_hw_breakpoint *brk);
+void __get_breakpoint(int nr, struct arch_hw_breakpoint *brk);
 bool ppc_breakpoint_available(void);
 #ifdef CONFIG_PPC_ADV_DEBUG_REGS
 extern void do_send_trap(struct pt_regs *regs, unsigned long address,
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 50436b52c213..6aa1f5c4d520 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -865,6 +865,11 @@ static inline int set_breakpoint_8xx(struct arch_hw_breakpoint *brk)
 	return 0;
 }
 
+void __get_breakpoint(int nr, struct arch_hw_breakpoint *brk)
+{
+	memcpy(brk, this_cpu_ptr(&current_brk[nr]), sizeof(*brk));
+}
+
 void __set_breakpoint(int nr, struct arch_hw_breakpoint *brk)
 {
 	memcpy(this_cpu_ptr(&current_brk[nr]), brk, sizeof(*brk));
diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index f9a3019e37b4..8d61a7d35b89 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -17,6 +17,9 @@
 #include
 #include
 #include
+#include
+#include
+#include
 
 static int __patch_instruction(u32 *exec_addr, struct ppc_inst instr, u32 *patch_addr)
 {
@@ -45,6 +48,59 @@ int raw_patch_instruction(u32 *addr, struct ppc_inst instr)
 }
 
 #ifdef CONFIG_STRICT_KERNEL_RWX
+
+struct temp_mm {
+	struct mm_struct *temp;
+	struct mm_struct *prev;
+	struct arch_hw_breakpoint brk[HBP_NUM_MAX];
+};
+
+static inline void init_temp_mm(struct temp_mm *temp_mm, struct mm_struct *mm)
+{
+	/* We currently only support temporary mm on the Book3s64 Radix MMU */
+	WARN_ON(!radix_enabled());
+
+	temp_mm->temp = mm;
+	temp_mm->prev = NULL;
+	memset(&temp_mm->brk, 0, sizeof(temp_mm->brk));
+}
+
+static inline void start_using_temporary_mm(struct temp_mm *temp_mm)
+{
+	lockdep_assert_irqs_disabled();
+
+	temp_mm->prev = current->active_mm;
+	switch_mm_irqs_off(temp_mm->prev, temp_mm->temp, current);
+
+	WARN_ON(!mm_is_thread_local(temp_mm->temp));
+
+	if (ppc_breakpoint_available()) {
+		struct arch_hw_breakpoint null_brk = {0};
+		int i = 0;
+
+		for (; i < nr_wp_slots(); ++i) {
+			__get_breakpoint(i, &temp_mm->brk[i]);
+			if (temp_mm->brk[i].type != 0)
+				__set_breakpoint(i, &null_brk);
+		}
+	}
+}
+
+static inline void stop_using_temporary_mm(struct temp_mm *temp_mm)
+{
+	lockdep_assert_irqs_disabled();
+
+	switch_mm_irqs_off(temp_mm->temp, temp_mm->prev, current);
+
+	if (ppc_breakpoint_available()) {
+		int i = 0;
+
+		for (; i < nr_wp_slots(); ++i)
+			if (temp_mm->brk[i].type != 0)
+				__set_breakpoint(i, &temp_mm->brk[i]);
+	}
+}
+
 static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
 
 static int text_area_cpu_up(unsigned int cpu)
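The breakpoint handling in patch 1/4 above follows a simple
save/clear/restore pattern: every slot is read, the active slots are
cleared while the temporary mm is loaded, and only the slots that were
active are written back afterwards. Below is a minimal userspace mock of
that pattern only - struct hw_brk, current_brk, get_brk(), set_brk() and
NR_WP_SLOTS are made-up stand-ins for the kernel's arch_hw_breakpoint,
per-cpu current_brk, __get_breakpoint(), __set_breakpoint() and
nr_wp_slots(), so the sketch compiles and runs on its own but is not
kernel code.

#include <stdio.h>

#define NR_WP_SLOTS 2                         /* stand-in for nr_wp_slots() */

struct hw_brk { unsigned long address; int type; }; /* mock arch_hw_breakpoint */

static struct hw_brk current_brk[NR_WP_SLOTS];      /* mock per-cpu breakpoint state */

static void get_brk(int nr, struct hw_brk *brk) { *brk = current_brk[nr]; }
static void set_brk(int nr, struct hw_brk *brk) { current_brk[nr] = *brk; }

int main(void)
{
        struct hw_brk saved[NR_WP_SLOTS];
        struct hw_brk null_brk = { 0 };
        int i;

        /* pretend userspace armed a watchpoint in slot 0 */
        current_brk[0] = (struct hw_brk){ .address = 0x10001000, .type = 1 };

        /* "start_using": save every slot, then clear the active ones */
        for (i = 0; i < NR_WP_SLOTS; i++) {
                get_brk(i, &saved[i]);
                if (saved[i].type != 0)
                        set_brk(i, &null_brk);
        }
        printf("while patching: slot 0 type = %d\n", current_brk[0].type);

        /* "stop_using": restore only the slots that were active */
        for (i = 0; i < NR_WP_SLOTS; i++)
                if (saved[i].type != 0)
                        set_brk(i, &saved[i]);
        printf("after patching: slot 0 address = %#lx\n", current_brk[0].address);
        return 0;
}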
From patchwork Sat Sep 11 02:29:02 2021
X-Patchwork-Submitter: "Christopher M. Riedl"
X-Patchwork-Id: 12537665
From: "Christopher M. Riedl"
To: linuxppc-dev@lists.ozlabs.org
Cc: linux-hardening@vger.kernel.org
Subject: [PATCH v6 2/4] powerpc: Rework and improve STRICT_KERNEL_RWX patching
Date: Fri, 10 Sep 2021 21:29:02 -0500
Message-Id: <20210911022904.30962-3-cmr@bluescreens.de>
In-Reply-To: <20210911022904.30962-1-cmr@bluescreens.de>

Rework code-patching with STRICT_KERNEL_RWX to prepare for a later patch
which uses a temporary mm for patching under the Book3s64 Radix MMU. Make
improvements by adding a WARN_ON when the patchsite doesn't match after
patching and return the error from __patch_instruction() properly.

Signed-off-by: Christopher M. Riedl
---

v6:  * Remove the pr_warn() message from unmap_patch_area().

v5:  * New to series.
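Patch 2/4 centralizes the poke address in the per-cpu cpu_patching_addr
variable and forms the writable alias as
cpu_patching_addr | offset_in_page(addr); because the per-cpu poke page
is page-aligned, OR-ing in the page offset is the same as adding it.
Below is a self-contained userspace sketch of that address math only,
assuming a 64K page size; the two addresses are made-up example values
and offset_in_page() is re-defined locally since this is not kernel
code.

#include <stdio.h>

#define PAGE_SIZE 0x10000UL                   /* assume 64K pages */
#define PAGE_MASK (~(PAGE_SIZE - 1))

/* same idea as the kernel helper: keep only the low, in-page bits */
#define offset_in_page(addr) ((unsigned long)(addr) & ~PAGE_MASK)

int main(void)
{
        /* hypothetical per-cpu poke page (page-aligned) and patch site */
        unsigned long cpu_patching_addr = 0xc008000000100000UL;
        unsigned long patch_site        = 0xc000000000abc124UL;

        /* same expression as the patch: poke page | offset within the page */
        unsigned long patch_addr = cpu_patching_addr | offset_in_page(patch_site);

        printf("write through %#lx to modify %#lx\n", patch_addr, patch_site);
        return 0;
}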
---
 arch/powerpc/lib/code-patching.c | 35 ++++++++++++++++----------------
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 8d61a7d35b89..8d0bb86125d5 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -102,6 +102,7 @@ static inline void stop_using_temporary_mm(struct temp_mm *temp_mm)
 }
 
 static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
+static DEFINE_PER_CPU(unsigned long, cpu_patching_addr);
 
 static int text_area_cpu_up(unsigned int cpu)
 {
@@ -114,6 +115,7 @@ static int text_area_cpu_up(unsigned int cpu)
 		return -1;
 	}
 	this_cpu_write(text_poke_area, area);
+	this_cpu_write(cpu_patching_addr, (unsigned long)area->addr);
 
 	return 0;
 }
@@ -139,7 +141,7 @@ void __init poking_init(void)
 /*
  * This can be called for kernel text or a module.
  */
-static int map_patch_area(void *addr, unsigned long text_poke_addr)
+static int map_patch_area(void *addr)
 {
 	unsigned long pfn;
 	int err;
@@ -149,17 +151,20 @@ static int map_patch_area(void *addr, unsigned long text_poke_addr)
 	else
 		pfn = __pa_symbol(addr) >> PAGE_SHIFT;
 
-	err = map_kernel_page(text_poke_addr, (pfn << PAGE_SHIFT), PAGE_KERNEL);
+	err = map_kernel_page(__this_cpu_read(cpu_patching_addr),
+			      (pfn << PAGE_SHIFT), PAGE_KERNEL);
 
-	pr_devel("Mapped addr %lx with pfn %lx:%d\n", text_poke_addr, pfn, err);
+	pr_devel("Mapped addr %lx with pfn %lx:%d\n",
+		 __this_cpu_read(cpu_patching_addr), pfn, err);
 	if (err)
 		return -1;
 
 	return 0;
 }
 
-static inline int unmap_patch_area(unsigned long addr)
+static inline int unmap_patch_area(void)
 {
+	unsigned long addr = __this_cpu_read(cpu_patching_addr);
 	pte_t *ptep;
 	pmd_t *pmdp;
 	pud_t *pudp;
@@ -199,11 +204,9 @@ static inline int unmap_patch_area(unsigned long addr)
 
 static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
 {
-	int err;
+	int err, rc = 0;
 	u32 *patch_addr = NULL;
 	unsigned long flags;
-	unsigned long text_poke_addr;
-	unsigned long kaddr = (unsigned long)addr;
 
 	/*
 	 * During early early boot patch_instruction is called
@@ -215,24 +218,20 @@ static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
 
 	local_irq_save(flags);
 
-	text_poke_addr = (unsigned long)__this_cpu_read(text_poke_area)->addr;
-	if (map_patch_area(addr, text_poke_addr)) {
-		err = -1;
+	err = map_patch_area(addr);
+	if (err)
 		goto out;
-	}
 
-	patch_addr = (u32 *)(text_poke_addr + (kaddr & ~PAGE_MASK));
+	patch_addr = (u32 *)(__this_cpu_read(cpu_patching_addr) | offset_in_page(addr));
+	rc = __patch_instruction(addr, instr, patch_addr);
 
-	__patch_instruction(addr, instr, patch_addr);
-
-	err = unmap_patch_area(text_poke_addr);
-	if (err)
-		pr_warn("failed to unmap %lx\n", text_poke_addr);
+	err = unmap_patch_area();
 
 out:
 	local_irq_restore(flags);
+	WARN_ON(!ppc_inst_equal(ppc_inst_read(addr), instr));
 
-	return err;
+	return rc ? rc : err;
 }
 
 #else /* !CONFIG_STRICT_KERNEL_RWX */
Riedl" X-Patchwork-Id: 12537667 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8A982C433EF for ; Sat, 11 Sep 2021 02:29:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6122E611CC for ; Sat, 11 Sep 2021 02:29:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235149AbhIKCak (ORCPT ); Fri, 10 Sep 2021 22:30:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48828 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231864AbhIKCaj (ORCPT ); Fri, 10 Sep 2021 22:30:39 -0400 Received: from mout-p-102.mailbox.org (mout-p-102.mailbox.org [IPv6:2001:67c:2050::465:102]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 07A43C061574 for ; Fri, 10 Sep 2021 19:29:27 -0700 (PDT) Received: from smtp102.mailbox.org (smtp102.mailbox.org [80.241.60.233]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-384) server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mout-p-102.mailbox.org (Postfix) with ESMTPS id 4H5xX5662jzQk95; Sat, 11 Sep 2021 04:29:25 +0200 (CEST) X-Virus-Scanned: amavisd-new at heinlein-support.de From: "Christopher M. Riedl" To: linuxppc-dev@lists.ozlabs.org Cc: linux-hardening@vger.kernel.org Subject: [PATCH v6 3/4] powerpc: Use WARN_ON and fix check in poking_init Date: Fri, 10 Sep 2021 21:29:03 -0500 Message-Id: <20210911022904.30962-4-cmr@bluescreens.de> In-Reply-To: <20210911022904.30962-1-cmr@bluescreens.de> References: <20210911022904.30962-1-cmr@bluescreens.de> MIME-Version: 1.0 X-Rspamd-Queue-Id: 9DE9826E Precedence: bulk List-ID: X-Mailing-List: linux-hardening@vger.kernel.org The latest kernel docs list BUG_ON() as 'deprecated' and that they should be replaced with WARN_ON() (or pr_warn()) when possible. The BUG_ON() in poking_init() warrants a WARN_ON() rather than a pr_warn() since the error condition is deemed "unreachable". Also take this opportunity to fix the failure check in the WARN_ON(): cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, ...) returns a positive integer on success and a negative integer on failure. Signed-off-by: Christopher M. Riedl --- v6: * New to series - based on Christophe's relentless feedback in the crusade against BUG_ON()s :) --- arch/powerpc/lib/code-patching.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c index 8d0bb86125d5..e802e42c2789 100644 --- a/arch/powerpc/lib/code-patching.c +++ b/arch/powerpc/lib/code-patching.c @@ -126,16 +126,11 @@ static int text_area_cpu_down(unsigned int cpu) return 0; } -/* - * Although BUG_ON() is rude, in this case it should only happen if ENOMEM, and - * we judge it as being preferable to a kernel that will crash later when - * someone tries to use patch_instruction(). 
- */
 void __init poking_init(void)
 {
-	BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+	WARN_ON(cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
 		"powerpc/text_poke:online", text_area_cpu_up,
-		text_area_cpu_down));
+		text_area_cpu_down) < 0);
 }
 
 /*

From patchwork Sat Sep 11 02:29:04 2021
X-Patchwork-Submitter: "Christopher M. Riedl"
X-Patchwork-Id: 12537669
From: "Christopher M. Riedl"
To: linuxppc-dev@lists.ozlabs.org
Cc: linux-hardening@vger.kernel.org
Subject: [PATCH v6 4/4] powerpc/64s: Initialize and use a temporary mm for patching on Radix
Date: Fri, 10 Sep 2021 21:29:04 -0500
Message-Id: <20210911022904.30962-5-cmr@bluescreens.de>
In-Reply-To: <20210911022904.30962-1-cmr@bluescreens.de>

When code patching a STRICT_KERNEL_RWX kernel the page containing the
address to be patched is temporarily mapped as writeable. Currently, a
per-cpu vmalloc patch area is used for this purpose. While the patch area
is per-cpu, the temporary page mapping is inserted into the kernel page
tables for the duration of patching. The mapping is exposed to CPUs other
than the patching CPU - this is undesirable from a hardening perspective.

Use a temporary mm instead which keeps the mapping local to the CPU doing
the patching. Use the `poking_init` init hook to prepare a temporary mm
and patching address. Initialize the temporary mm by copying the init mm.
Choose a randomized patching address inside the temporary mm userspace
address space. The patching address is randomized between PAGE_SIZE and
DEFAULT_MAP_WINDOW-PAGE_SIZE.
Bits of entropy with 64K page size on BOOK3S_64:

	bits of entropy = log2(DEFAULT_MAP_WINDOW_USER64 / PAGE_SIZE)

	PAGE_SIZE=64K, DEFAULT_MAP_WINDOW_USER64=128TB

	bits of entropy = log2(128TB / 64K)
	bits of entropy = 31

The upper limit is DEFAULT_MAP_WINDOW due to how the Book3s64 Hash MMU
operates - by default the space above DEFAULT_MAP_WINDOW is not
available. Currently the Hash MMU does not use a temporary mm so
technically this upper limit isn't necessary; however, a larger
randomization range does not further "harden" this overall approach and
future work may introduce patching with a temporary mm on Hash as well.

Randomization occurs only once during initialization at boot for each
possible CPU in the system.

Introduce two new functions, map_patch_mm() and unmap_patch_mm(), to
respectively create and remove the temporary mapping with write
permissions at patching_addr. Map the page with PAGE_KERNEL to set EAA[0]
for the PTE which ignores the AMR (so no need to unlock/lock KUAP)
according to PowerISA v3.0b Figure 35 on Radix.

Based on x86 implementation:

commit 4fc19708b165 ("x86/alternatives: Initialize temporary mm for patching")

and:

commit b3fd8e83ada0 ("x86/alternatives: Use temporary mm for text poking")

Signed-off-by: Christopher M. Riedl
---

v6:  * Small clean-ups (naming, formatting, style, etc).
     * Call stop_using_temporary_mm() before pte_unmap_unlock() after
       patching.
     * Replace BUG_ON()s in poking_init() w/ WARN_ON()s.

v5:  * Only support Book3s64 Radix MMU for now.
     * Use a per-cpu datastructure to hold the patching_addr and
       patching_mm to avoid the need for a synchronization lock/mutex.

v4:  * In the previous series this was two separate patches: one to init
       the temporary mm in poking_init() (unused in powerpc at the time)
       and the other to use it for patching (which removed all the
       per-cpu vmalloc code). Now that we use poking_init() in the
       existing per-cpu vmalloc approach, that separation doesn't work as
       nicely anymore so I just merged the two patches into one.
     * Preload the SLB entry and hash the page for the patching_addr when
       using Hash on book3s64 to avoid taking an SLB and Hash fault
       during patching. The previous implementation was a hack which
       changed current->mm to allow the SLB and Hash fault handlers to
       work with the temporary mm since both of those code-paths always
       assume mm == current->mm.
     * Also (hmm - seeing a trend here) with the book3s64 Hash MMU we
       have to manage the mm->context.active_cpus counter and mm cpumask
       since they determine (via mm_is_thread_local()) if the TLB flush
       in pte_clear() is local or not - it should always be local when
       we're using the temporary mm. On book3s64's Radix MMU we can just
       call local_flush_tlb_mm().
     * Use HPTE_USE_KERNEL_KEY on Hash to avoid costly lock/unlock of
       KUAP.
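The randomization formula and the entropy figure above can be reproduced
with a short userspace sketch. random() stands in for the kernel's
get_random_long(), the constants assume the 64K page / 128TB
DEFAULT_MAP_WINDOW configuration from the example, and the address
expression is the one used by __poking_init_temp_mm() in the diff below;
this is an illustration of the arithmetic, not kernel code (build with
-lm for log2()).

#include <math.h>
#include <stdio.h>
#include <stdlib.h>

#define PAGE_SIZE          0x10000UL                            /* 64K */
#define PAGE_MASK          (~(PAGE_SIZE - 1))
#define DEFAULT_MAP_WINDOW (128UL * 1024 * 1024 * 1024 * 1024)  /* 128TB */

int main(void)
{
        /* random() is a weak stand-in for get_random_long() */
        unsigned long rnd = ((unsigned long)random() << 32) | random();

        /* page-aligned, skips the zero page, stays below
         * DEFAULT_MAP_WINDOW - PAGE_SIZE */
        unsigned long patching_addr = PAGE_SIZE +
                ((rnd & PAGE_MASK) % (DEFAULT_MAP_WINDOW - 2 * PAGE_SIZE));

        double entropy = log2((double)DEFAULT_MAP_WINDOW / PAGE_SIZE);

        printf("patching_addr = %#lx\n", patching_addr);
        printf("bits of entropy ~= %.0f\n", entropy);           /* prints 31 */
        return 0;
}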
---
 arch/powerpc/lib/code-patching.c | 119 +++++++++++++++++++++++++++++--
 1 file changed, 112 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index e802e42c2789..af8e2a02a9dd 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -103,6 +104,7 @@ static inline void stop_using_temporary_mm(struct temp_mm *temp_mm)
 
 static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
 static DEFINE_PER_CPU(unsigned long, cpu_patching_addr);
+static DEFINE_PER_CPU(struct mm_struct *, cpu_patching_mm);
 
 static int text_area_cpu_up(unsigned int cpu)
 {
@@ -126,8 +128,48 @@ static int text_area_cpu_down(unsigned int cpu)
 	return 0;
 }
 
+static __always_inline void __poking_init_temp_mm(void)
+{
+	int cpu;
+	spinlock_t *ptl; /* for protecting pte table */
+	pte_t *ptep;
+	struct mm_struct *patching_mm;
+	unsigned long patching_addr;
+
+	for_each_possible_cpu(cpu) {
+		patching_mm = copy_init_mm();
+		WARN_ON(!patching_mm);
+		per_cpu(cpu_patching_mm, cpu) = patching_mm;
+
+		/*
+		 * Choose a randomized, page-aligned address from the range:
+		 * [PAGE_SIZE, DEFAULT_MAP_WINDOW - PAGE_SIZE] The lower
+		 * address bound is PAGE_SIZE to avoid the zero-page. The
+		 * upper address bound is DEFAULT_MAP_WINDOW - PAGE_SIZE to
+		 * stay under DEFAULT_MAP_WINDOW with the Book3s64 Hash MMU.
+		 */
+		patching_addr = PAGE_SIZE + ((get_random_long() & PAGE_MASK)
+				% (DEFAULT_MAP_WINDOW - 2 * PAGE_SIZE));
+		per_cpu(cpu_patching_addr, cpu) = patching_addr;
+
+		/*
+		 * PTE allocation uses GFP_KERNEL which means we need to
+		 * pre-allocate the PTE here because we cannot do the
+		 * allocation during patching when IRQs are disabled.
+		 */
+		ptep = get_locked_pte(patching_mm, patching_addr, &ptl);
+		WARN_ON(!ptep);
+		pte_unmap_unlock(ptep, ptl);
+	}
+}
+
 void __init poking_init(void)
 {
+	if (radix_enabled()) {
+		__poking_init_temp_mm();
+		return;
+	}
+
 	WARN_ON(cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
 		"powerpc/text_poke:online", text_area_cpu_up,
 		text_area_cpu_down) < 0);
@@ -197,30 +239,93 @@ static inline int unmap_patch_area(void)
 	return 0;
 }
 
+struct patch_mapping {
+	spinlock_t *ptl; /* for protecting pte table */
+	pte_t *ptep;
+	struct temp_mm temp_mm;
+};
+
+/*
+ * This can be called for kernel text or a module.
+ */
+static int map_patch_mm(const void *addr, struct patch_mapping *patch_mapping)
+{
+	struct page *page;
+	struct mm_struct *patching_mm = __this_cpu_read(cpu_patching_mm);
+	unsigned long patching_addr = __this_cpu_read(cpu_patching_addr);
+
+	if (is_vmalloc_or_module_addr(addr))
+		page = vmalloc_to_page(addr);
+	else
+		page = virt_to_page(addr);
+
+	patch_mapping->ptep = get_locked_pte(patching_mm, patching_addr,
+					     &patch_mapping->ptl);
+	if (unlikely(!patch_mapping->ptep)) {
+		pr_warn("map patch: failed to allocate pte for patching\n");
+		return -1;
+	}
+
+	set_pte_at(patching_mm, patching_addr, patch_mapping->ptep,
+		   pte_mkdirty(mk_pte(page, PAGE_KERNEL)));
+
+	init_temp_mm(&patch_mapping->temp_mm, patching_mm);
+	start_using_temporary_mm(&patch_mapping->temp_mm);
+
+	return 0;
+}
+
+static int unmap_patch_mm(struct patch_mapping *patch_mapping)
+{
+	struct mm_struct *patching_mm = __this_cpu_read(cpu_patching_mm);
+	unsigned long patching_addr = __this_cpu_read(cpu_patching_addr);
+
+	pte_clear(patching_mm, patching_addr, patch_mapping->ptep);
+
+	local_flush_tlb_mm(patching_mm);
+	stop_using_temporary_mm(&patch_mapping->temp_mm);
+
+	pte_unmap_unlock(patch_mapping->ptep, patch_mapping->ptl);
+
+	return 0;
+}
+
 static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
 {
 	int err, rc = 0;
 	u32 *patch_addr = NULL;
 	unsigned long flags;
+	struct patch_mapping patch_mapping;
 
 	/*
-	 * During early early boot patch_instruction is called
-	 * when text_poke_area is not ready, but we still need
-	 * to allow patching. We just do the plain old patching
+	 * During early early boot patch_instruction is called when the
+	 * patching_mm/text_poke_area is not ready, but we still need to allow
+	 * patching. We just do the plain old patching.
 	 */
-	if (!this_cpu_read(text_poke_area))
-		return raw_patch_instruction(addr, instr);
+	if (radix_enabled()) {
+		if (!this_cpu_read(cpu_patching_mm))
+			return raw_patch_instruction(addr, instr);
+	} else {
+		if (!this_cpu_read(text_poke_area))
+			return raw_patch_instruction(addr, instr);
+	}
 
 	local_irq_save(flags);
 
-	err = map_patch_area(addr);
+	if (radix_enabled())
+		err = map_patch_mm(addr, &patch_mapping);
+	else
+		err = map_patch_area(addr);
 	if (err)
 		goto out;
 
 	patch_addr = (u32 *)(__this_cpu_read(cpu_patching_addr) | offset_in_page(addr));
 	rc = __patch_instruction(addr, instr, patch_addr);
 
-	err = unmap_patch_area();
+	if (radix_enabled())
+		err = unmap_patch_mm(&patch_mapping);
+	else
+		err = unmap_patch_area();
 
 out:
 	local_irq_restore(flags);