From patchwork Mon May 22 20:33:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiri Olsa X-Patchwork-Id: 13251068 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AB021154AF for ; Mon, 22 May 2023 20:34:17 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E84C7C4339B; Mon, 22 May 2023 20:34:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1684787657; bh=CR+uhegPGZnQF02rO5sFeA05nPiC/9Ec9ZWdVQpJmlw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=NjQ3hoh4ha/iUxGPxWNu2SKDmxUuYh2SJImlsMDK6q4UgG9rUr+NvWycE7mmGS+KS F49spCPs8lRMeEvjtQGnrxGDpIMA4mmy+4c/Yg4284v9IcvjYfPYh+M9h96WncZQJs J5l2z2jI58XzRx3J0RgJri5RQAXsIZHp/5O/KsrTXuQ5fkLZQMvqP8JevAS4dla65V 5dl3ag7vA6cV+5QG9PKivZlH5/OQ1KdF7T7TADHIxMPD/llQrpdeUTAbTQNaXJBEMw 8ECIdVo8BFvcMeU2CVqzlkY0Wq7rQXbFb4CBCMVEJSxra1PpTKb4BpN26ERuKR2Uim RMy642v/iMArg== From: Jiri Olsa To: stable@vger.kernel.org Cc: Linus Torvalds , Masami Hiramatsu , x86@kernel.org, linux-mm@kvack.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, Tsahee Zidenberg , Andrii Nakryiko , Christoph Hellwig , Daniel Borkmann , Thomas Gleixner , =?utf-8?q?Mah=C3=A9_Tardy?= , linux-arm-kernel@lists.infradead.org Subject: [RFC PATCH stable 5.4 1/8] uaccess: Add strict non-pagefault kernel-space read function Date: Mon, 22 May 2023 22:33:45 +0200 Message-Id: <20230522203352.738576-2-jolsa@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230522203352.738576-1-jolsa@kernel.org> References: <20230522203352.738576-1-jolsa@kernel.org> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-State: RFC From: Daniel Borkmann commit 75a1a607bb7e6d918be3aca11ec2214a275392f4 upstream. Add two new probe_kernel_read_strict() and strncpy_from_unsafe_strict() helpers which by default alias to the __probe_kernel_read() and the __strncpy_from_unsafe(), respectively, but can be overridden by archs which have non-overlapping address ranges for kernel space and user space in order to bail out with -EFAULT when attempting to probe user memory including non-canonical user access addresses [0]: 4-level page tables: user-space mem: 0x0000000000000000 - 0x00007fffffffffff non-canonical: 0x0000800000000000 - 0xffff7fffffffffff 5-level page tables: user-space mem: 0x0000000000000000 - 0x00ffffffffffffff non-canonical: 0x0100000000000000 - 0xfeffffffffffffff The idea is that these helpers are complementary to the probe_user_read() and strncpy_from_unsafe_user() which probe user-only memory. Both added helpers here do the same, but for kernel-only addresses. Both set of helpers are going to be used for BPF tracing. They also explicitly avoid throwing the splat for non-canonical user addresses from 00c42373d397 ("x86-64: add warning for non-canonical user access address dereferences"). For compat, the current probe_kernel_read() and strncpy_from_unsafe() are left as-is. [0] Documentation/x86/x86_64/mm.txt Signed-off-by: Daniel Borkmann Signed-off-by: Alexei Starovoitov Cc: Linus Torvalds Cc: Masami Hiramatsu Cc: x86@kernel.org Link: https://lore.kernel.org/bpf/eefeefd769aa5a013531f491a71f0936779e916b.1572649915.git.daniel@iogearbox.net --- arch/x86/mm/Makefile | 2 +- arch/x86/mm/maccess.c | 43 +++++++++++++++++++++++++++++++++++++++++ include/linux/uaccess.h | 4 ++++ mm/maccess.c | 25 +++++++++++++++++++++++- 4 files changed, 72 insertions(+), 2 deletions(-) create mode 100644 arch/x86/mm/maccess.c diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index 84373dc9b341..bbc68a54795e 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -13,7 +13,7 @@ CFLAGS_REMOVE_mem_encrypt_identity.o = -pg endif obj-y := init.o init_$(BITS).o fault.o ioremap.o extable.o pageattr.o mmap.o \ - pat.o pgtable.o physaddr.o setup_nx.o tlb.o cpu_entry_area.o + pat.o pgtable.o physaddr.o setup_nx.o tlb.o cpu_entry_area.o maccess.o # Make sure __phys_addr has no stackprotector nostackp := $(call cc-option, -fno-stack-protector) diff --git a/arch/x86/mm/maccess.c b/arch/x86/mm/maccess.c new file mode 100644 index 000000000000..f5b85bdc0535 --- /dev/null +++ b/arch/x86/mm/maccess.c @@ -0,0 +1,43 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include +#include + +#ifdef CONFIG_X86_64 +static __always_inline u64 canonical_address(u64 vaddr, u8 vaddr_bits) +{ + return ((s64)vaddr << (64 - vaddr_bits)) >> (64 - vaddr_bits); +} + +static __always_inline bool invalid_probe_range(u64 vaddr) +{ + /* + * Range covering the highest possible canonical userspace address + * as well as non-canonical address range. For the canonical range + * we also need to include the userspace guard page. + */ + return vaddr < TASK_SIZE_MAX + PAGE_SIZE || + canonical_address(vaddr, boot_cpu_data.x86_virt_bits) != vaddr; +} +#else +static __always_inline bool invalid_probe_range(u64 vaddr) +{ + return vaddr < TASK_SIZE_MAX; +} +#endif + +long probe_kernel_read_strict(void *dst, const void *src, size_t size) +{ + if (unlikely(invalid_probe_range((unsigned long)src))) + return -EFAULT; + + return __probe_kernel_read(dst, src, size); +} + +long strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr, long count) +{ + if (unlikely(invalid_probe_range((unsigned long)unsafe_addr))) + return -EFAULT; + + return __strncpy_from_unsafe(dst, unsafe_addr, count); +} diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h index 70941f49d66e..25ae650dcb1a 100644 --- a/include/linux/uaccess.h +++ b/include/linux/uaccess.h @@ -315,6 +315,7 @@ copy_struct_from_user(void *dst, size_t ksize, const void __user *src, * happens, handle that and return -EFAULT. */ extern long probe_kernel_read(void *dst, const void *src, size_t size); +extern long probe_kernel_read_strict(void *dst, const void *src, size_t size); extern long __probe_kernel_read(void *dst, const void *src, size_t size); /* @@ -354,6 +355,9 @@ extern long notrace probe_user_write(void __user *dst, const void *src, size_t s extern long notrace __probe_user_write(void __user *dst, const void *src, size_t size); extern long strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count); +extern long strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr, + long count); +extern long __strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count); extern long strncpy_from_unsafe_user(char *dst, const void __user *unsafe_addr, long count); extern long strnlen_unsafe_user(const void __user *unsafe_addr, long count); diff --git a/mm/maccess.c b/mm/maccess.c index 2d3c3d01064c..3ca8d97e5010 100644 --- a/mm/maccess.c +++ b/mm/maccess.c @@ -43,11 +43,20 @@ probe_write_common(void __user *dst, const void *src, size_t size) * do_page_fault() doesn't attempt to take mmap_sem. This makes * probe_kernel_read() suitable for use within regions where the caller * already holds mmap_sem, or other locks which nest inside mmap_sem. + * + * probe_kernel_read_strict() is the same as probe_kernel_read() except for + * the case where architectures have non-overlapping user and kernel address + * ranges: probe_kernel_read_strict() will additionally return -EFAULT for + * probing memory on a user address range where probe_user_read() is supposed + * to be used instead. */ long __weak probe_kernel_read(void *dst, const void *src, size_t size) __attribute__((alias("__probe_kernel_read"))); +long __weak probe_kernel_read_strict(void *dst, const void *src, size_t size) + __attribute__((alias("__probe_kernel_read"))); + long __probe_kernel_read(void *dst, const void *src, size_t size) { long ret; @@ -157,8 +166,22 @@ EXPORT_SYMBOL_GPL(probe_user_write); * * If @count is smaller than the length of the string, copies @count-1 bytes, * sets the last byte of @dst buffer to NUL and returns @count. + * + * strncpy_from_unsafe_strict() is the same as strncpy_from_unsafe() except + * for the case where architectures have non-overlapping user and kernel address + * ranges: strncpy_from_unsafe_strict() will additionally return -EFAULT for + * probing memory on a user address range where strncpy_from_unsafe_user() is + * supposed to be used instead. */ -long strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count) + +long __weak strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count) + __attribute__((alias("__strncpy_from_unsafe"))); + +long __weak strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr, + long count) + __attribute__((alias("__strncpy_from_unsafe"))); + +long __strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count) { mm_segment_t old_fs = get_fs(); const void *src = unsafe_addr; From patchwork Mon May 22 20:33:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiri Olsa X-Patchwork-Id: 13251084 X-Patchwork-Delegate: bpf@iogearbox.net Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 18089154AF for ; Mon, 22 May 2023 20:34:30 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BDC11C433EF; Mon, 22 May 2023 20:34:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1684787669; bh=wU2N1BOhyNmMdvHXU/2VocFu1lHdohwPll/T20B+ryE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=bvAxYlu5Q0RwXXod4VhvsywM0NXgCR44VayRB0M1gpt3A6LERtbk3mFI5hhO3w5PU flixUhFphzdspyyEVMn0I6QHJ696bgaToVc3xyjcC4kmOK9yvhx3PqOBOWA2D2aFkF 8ytcYLfMLqqvfYUe4OuFiCPd4TynSPkMxnhswgkE7/niNtDl6mjYCUg7Tp5pLMp9wx UR9zQpkB09m2STIPWImCiraooij2D52CkbHBJKcF5Wq/XBD+Bzjk1kgP+fFDtHXApH 7dD5myzKXsO7i9upbNvWN9nIuc1lzWAAriVD10AT8/69YaNunqgauei0EcVnj1MFuc sfIWwB+xT/mNQ== From: Jiri Olsa To: stable@vger.kernel.org Cc: Andrii Nakryiko , linux-mm@kvack.org, bpf@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Masami Hiramatsu , Tsahee Zidenberg , Andrii Nakryiko , Christoph Hellwig , Daniel Borkmann , Thomas Gleixner , =?utf-8?q?Mah=C3=A9_Tardy?= , linux-arm-kernel@lists.infradead.org Subject: [RFC PATCH stable 5.4 2/8] bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers Date: Mon, 22 May 2023 22:33:46 +0200 Message-Id: <20230522203352.738576-3-jolsa@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230522203352.738576-1-jolsa@kernel.org> References: <20230522203352.738576-1-jolsa@kernel.org> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net X-Patchwork-State: RFC From: Daniel Borkmann commit 6ae08ae3dea2cfa03dd3665a3c8475c2d429ef47 upstream. [Taking only hunks that are related to probe_read and probe_read_str helpers, which we want to fix. Taking this patch/hunks as a base for following changes and ommiting the new helpers and uapi changes.] The current bpf_probe_read() and bpf_probe_read_str() helpers are broken in that they assume they can be used for probing memory access for kernel space addresses /as well as/ user space addresses. However, plain use of probe_kernel_read() for both cases will attempt to always access kernel space address space given access is performed under KERNEL_DS and some archs in-fact have overlapping address spaces where a kernel pointer and user pointer would have the /same/ address value and therefore accessing application memory via bpf_probe_read{,_str}() would read garbage values. Lets fix BPF side by making use of recently added 3d7081822f7f ("uaccess: Add non-pagefault user-space read functions"). Unfortunately, the only way to fix this status quo is to add dedicated bpf_probe_read_{user,kernel}() and bpf_probe_read_{user,kernel}_str() helpers. The bpf_probe_read{,_str}() helpers are kept as-is to retain their current behavior. The two *_user() variants attempt the access always under USER_DS set, the two *_kernel() variants will -EFAULT when accessing user memory if the underlying architecture has non-overlapping address ranges, also avoiding throwing the kernel warning via 00c42373d397 ("x86-64: add warning for non-canonical user access address dereferences"). Fixes: a5e8c07059d0 ("bpf: add bpf_probe_read_str helper") Fixes: 2541517c32be ("tracing, perf: Implement BPF programs attached to kprobes") Signed-off-by: Daniel Borkmann Signed-off-by: Alexei Starovoitov Acked-by: Andrii Nakryiko Link: https://lore.kernel.org/bpf/796ee46e948bc808d54891a1108435f8652c6ca4.1572649915.git.daniel@iogearbox.net --- kernel/trace/bpf_trace.c | 103 ++++++++++++++++++++++----------------- 1 file changed, 57 insertions(+), 46 deletions(-) diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 1e1345cd21b4..9ac27d48cc8e 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -138,24 +138,70 @@ static const struct bpf_func_proto bpf_override_return_proto = { }; #endif -BPF_CALL_3(bpf_probe_read, void *, dst, u32, size, const void *, unsafe_ptr) +static __always_inline int +bpf_probe_read_kernel_common(void *dst, u32 size, const void *unsafe_ptr, + const bool compat) { - int ret; + int ret = security_locked_down(LOCKDOWN_BPF_READ); - ret = security_locked_down(LOCKDOWN_BPF_READ); - if (ret < 0) + if (unlikely(ret < 0)) goto out; - - ret = probe_kernel_read(dst, unsafe_ptr, size); + ret = compat ? probe_kernel_read(dst, unsafe_ptr, size) : + probe_kernel_read_strict(dst, unsafe_ptr, size); if (unlikely(ret < 0)) out: memset(dst, 0, size); + return ret; +} + +BPF_CALL_3(bpf_probe_read_compat, void *, dst, u32, size, + const void *, unsafe_ptr) +{ + return bpf_probe_read_kernel_common(dst, size, unsafe_ptr, true); +} + +static const struct bpf_func_proto bpf_probe_read_compat_proto = { + .func = bpf_probe_read_compat, + .gpl_only = true, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_UNINIT_MEM, + .arg2_type = ARG_CONST_SIZE_OR_ZERO, + .arg3_type = ARG_ANYTHING, +}; +static __always_inline int +bpf_probe_read_kernel_str_common(void *dst, u32 size, const void *unsafe_ptr, + const bool compat) +{ + int ret = security_locked_down(LOCKDOWN_BPF_READ); + + if (unlikely(ret < 0)) + goto out; + /* + * The strncpy_from_unsafe_*() call will likely not fill the entire + * buffer, but that's okay in this circumstance as we're probing + * arbitrary memory anyway similar to bpf_probe_read_*() and might + * as well probe the stack. Thus, memory is explicitly cleared + * only in error case, so that improper users ignoring return + * code altogether don't copy garbage; otherwise length of string + * is returned that can be used for bpf_perf_event_output() et al. + */ + ret = compat ? strncpy_from_unsafe(dst, unsafe_ptr, size) : + strncpy_from_unsafe_strict(dst, unsafe_ptr, size); + if (unlikely(ret < 0)) +out: + memset(dst, 0, size); return ret; } -static const struct bpf_func_proto bpf_probe_read_proto = { - .func = bpf_probe_read, +BPF_CALL_3(bpf_probe_read_compat_str, void *, dst, u32, size, + const void *, unsafe_ptr) +{ + return bpf_probe_read_kernel_str_common(dst, size, unsafe_ptr, true); +} + +static const struct bpf_func_proto bpf_probe_read_compat_str_proto = { + .func = bpf_probe_read_compat_str, .gpl_only = true, .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_UNINIT_MEM, @@ -583,41 +629,6 @@ static const struct bpf_func_proto bpf_current_task_under_cgroup_proto = { .arg2_type = ARG_ANYTHING, }; -BPF_CALL_3(bpf_probe_read_str, void *, dst, u32, size, - const void *, unsafe_ptr) -{ - int ret; - - ret = security_locked_down(LOCKDOWN_BPF_READ); - if (ret < 0) - goto out; - - /* - * The strncpy_from_unsafe() call will likely not fill the entire - * buffer, but that's okay in this circumstance as we're probing - * arbitrary memory anyway similar to bpf_probe_read() and might - * as well probe the stack. Thus, memory is explicitly cleared - * only in error case, so that improper users ignoring return - * code altogether don't copy garbage; otherwise length of string - * is returned that can be used for bpf_perf_event_output() et al. - */ - ret = strncpy_from_unsafe(dst, unsafe_ptr, size); - if (unlikely(ret < 0)) -out: - memset(dst, 0, size); - - return ret; -} - -static const struct bpf_func_proto bpf_probe_read_str_proto = { - .func = bpf_probe_read_str, - .gpl_only = true, - .ret_type = RET_INTEGER, - .arg1_type = ARG_PTR_TO_UNINIT_MEM, - .arg2_type = ARG_CONST_SIZE_OR_ZERO, - .arg3_type = ARG_ANYTHING, -}; - struct send_signal_irq_work { struct irq_work irq_work; struct task_struct *task; @@ -700,8 +711,6 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return &bpf_map_pop_elem_proto; case BPF_FUNC_map_peek_elem: return &bpf_map_peek_elem_proto; - case BPF_FUNC_probe_read: - return &bpf_probe_read_proto; case BPF_FUNC_ktime_get_ns: return &bpf_ktime_get_ns_proto; case BPF_FUNC_tail_call: @@ -728,8 +737,10 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return &bpf_current_task_under_cgroup_proto; case BPF_FUNC_get_prandom_u32: return &bpf_get_prandom_u32_proto; + case BPF_FUNC_probe_read: + return &bpf_probe_read_compat_proto; case BPF_FUNC_probe_read_str: - return &bpf_probe_read_str_proto; + return &bpf_probe_read_compat_str_proto; #ifdef CONFIG_CGROUPS case BPF_FUNC_get_current_cgroup_id: return &bpf_get_current_cgroup_id_proto; From patchwork Mon May 22 20:33:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiri Olsa X-Patchwork-Id: 13251085 X-Patchwork-Delegate: bpf@iogearbox.net Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D4D9D154AF for ; Mon, 22 May 2023 20:34:42 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DB531C433D2; Mon, 22 May 2023 20:34:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1684787682; bh=1nSMve/w+XuFrnfWUZ8RsoJsky+kpG672VPmYmGZMVI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=OjIfMDYxThs+94da/eP9DyEKjk1djZsZmlgvOcUGIyktqWEeB8FDfqbvX2OvN1Ndg 4LNUZF1YbqGcHFMGy/wYetTmlYq8TMdF5GxRgf8C25A1ztK1VriLnqHLM86GQymu+N UYv/1/079BM+ixi2hmsqeYBlLheImXe17wtzaveWoqaxNyL4B96eiccZWuQx+c9KKT iOIaHcGr+VxdNg5jK1RnmG9WcMbVc5OXeqScKHS85tqpBm4x+jT7qFp5OXLLDvS+75 GEV8VDoTAwzMkTCEIWJ0XQq3rYjyuGT9eB7fuONVoKHETfPESc+IkYgvd4IDAy1o+Q 7Mk933+rF7JDg== From: Jiri Olsa To: stable@vger.kernel.org Cc: Linus Torvalds , Masami Hiramatsu , Brendan Gregg , Christoph Hellwig , linux-mm@kvack.org, bpf@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Tsahee Zidenberg , Andrii Nakryiko , Daniel Borkmann , Thomas Gleixner , =?utf-8?q?Mah=C3=A9_Tardy?= , linux-arm-kernel@lists.infradead.org Subject: [RFC PATCH stable 5.4 3/8] bpf: Restrict bpf_probe_read{, str}() only to archs where they work Date: Mon, 22 May 2023 22:33:47 +0200 Message-Id: <20230522203352.738576-4-jolsa@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230522203352.738576-1-jolsa@kernel.org> References: <20230522203352.738576-1-jolsa@kernel.org> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net X-Patchwork-State: RFC From: Daniel Borkmann commit 0ebeea8ca8a4d1d453ad299aef0507dab04f6e8d upstream. [Small context conflicts due to not bckported changes in previous patch] Given the legacy bpf_probe_read{,str}() BPF helpers are broken on archs with overlapping address ranges, we should really take the next step to disable them from BPF use there. To generally fix the situation, we've recently added new helper variants bpf_probe_read_{user,kernel}() and bpf_probe_read_{user,kernel}_str(). For details on them, see 6ae08ae3dea2 ("bpf: Add probe_read_{user, kernel} and probe_read_{user,kernel}_str helpers"). Given bpf_probe_read{,str}() have been around for ~5 years by now, there are plenty of users at least on x86 still relying on them today, so we cannot remove them entirely w/o breaking the BPF tracing ecosystem. However, their use should be restricted to archs with non-overlapping address ranges where they are working in their current form. Therefore, move this behind a CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE and have x86, arm64, arm select it (other archs supporting it can follow-up on it as well). For the remaining archs, they can workaround easily by relying on the feature probe from bpftool which spills out defines that can be used out of BPF C code to implement the drop-in replacement for old/new kernels via: bpftool feature probe macro Suggested-by: Linus Torvalds Signed-off-by: Daniel Borkmann Signed-off-by: Alexei Starovoitov Reviewed-by: Masami Hiramatsu Acked-by: Linus Torvalds Cc: Brendan Gregg Cc: Christoph Hellwig Link: https://lore.kernel.org/bpf/20200515101118.6508-2-daniel@iogearbox.net --- arch/arm/Kconfig | 1 + arch/arm64/Kconfig | 1 + arch/x86/Kconfig | 1 + init/Kconfig | 3 +++ kernel/trace/bpf_trace.c | 2 ++ 5 files changed, 8 insertions(+) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index a70696a95b79..7c1cb0ebdb18 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -14,6 +14,7 @@ config ARM select ARCH_HAS_KEEPINITRD select ARCH_HAS_KCOV select ARCH_HAS_MEMBARRIER_SYNC_CORE + select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE select ARCH_HAS_PTE_SPECIAL if ARM_LPAE select ARCH_HAS_PHYS_TO_DMA select ARCH_HAS_SETUP_DMA_OPS diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 384b1bf56667..0d96acb2ca3e 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -22,6 +22,7 @@ config ARM64 select ARCH_HAS_KCOV select ARCH_HAS_KEEPINITRD select ARCH_HAS_MEMBARRIER_SYNC_CORE + select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE select ARCH_HAS_PTE_DEVMAP select ARCH_HAS_PTE_SPECIAL select ARCH_HAS_SETUP_DMA_OPS diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 6002252692af..7be388116732 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -70,6 +70,7 @@ config X86 select ARCH_HAS_KCOV if X86_64 select ARCH_HAS_MEM_ENCRYPT select ARCH_HAS_MEMBARRIER_SYNC_CORE + select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE select ARCH_HAS_PMEM_API if X86_64 select ARCH_HAS_PTE_DEVMAP if X86_64 select ARCH_HAS_PTE_SPECIAL diff --git a/init/Kconfig b/init/Kconfig index f641518f4ac5..2297b7ce6665 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -2231,6 +2231,9 @@ config ASN1 source "kernel/Kconfig.locks" +config ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE + bool + config ARCH_HAS_SYNC_CORE_BEFORE_USERMODE bool diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 9ac27d48cc8e..61c81c38202b 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -737,10 +737,12 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return &bpf_current_task_under_cgroup_proto; case BPF_FUNC_get_prandom_u32: return &bpf_get_prandom_u32_proto; +#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE case BPF_FUNC_probe_read: return &bpf_probe_read_compat_proto; case BPF_FUNC_probe_read_str: return &bpf_probe_read_compat_str_proto; +#endif #ifdef CONFIG_CGROUPS case BPF_FUNC_get_current_cgroup_id: return &bpf_get_current_cgroup_id_proto; From patchwork Mon May 22 20:33:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiri Olsa X-Patchwork-Id: 13251086 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BEDE3154AF for ; Mon, 22 May 2023 20:34:55 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9E043C4339C; Mon, 22 May 2023 20:34:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1684787695; bh=XIX2yDn63uQ9xxgBYNjPu4x/2B0UpeRMuNabTt6UxWo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=I2rAgK7xltyoIWKVyVkj6rwC5zqUbLDvW5Rq9rbpA8rTmzVfu5aNbZ1Ge4E3dW7xd wPuDx9IGe7R/m/hwaRUlNIR6ZLSSOJt1DW68fT7ppAeZo00YdTnTNFWpnzYapX64dR mApCcZpG+GnWsKaSaIDTYq2+X5u7fNghE2GoFwfC9ihE7Q7iGC+asOT7M57iXsMze/ DCtwgF6Rb1yYhr7qruSP2JpqhOCh4wH6lxnZEB1PqUEFqw2URi9dmRCQeO8wy75w0I gJn82MRQXTiHUFBsMXtqpCwKQ5b/sf8D1y0CMz1MDJofx6Dn6jH2SYWTuTYJ/KnpFM QwLIQw3Wxm1nQ== From: Jiri Olsa To: stable@vger.kernel.org Cc: Alexei Starovoitov , Daniel Borkmann , "H. Peter Anvin" , Ingo Molnar , Masami Hiramatsu , Thomas Gleixner , linux-mm@kvack.org, bpf@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Tsahee Zidenberg , Andrii Nakryiko , Christoph Hellwig , =?utf-8?q?Mah=C3=A9_Tardy?= , linux-arm-kernel@lists.infradead.org Subject: [RFC PATCH stable 5.4 4/8] maccess: clarify kerneldoc comments Date: Mon, 22 May 2023 22:33:48 +0200 Message-Id: <20230522203352.738576-5-jolsa@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230522203352.738576-1-jolsa@kernel.org> References: <20230522203352.738576-1-jolsa@kernel.org> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-State: RFC From: Christoph Hellwig commit 4f6de12b375c37ba51f9412be7ed6ab44a7f71d8 upstream. Add proper kerneldoc comments for probe_kernel_read_strict and probe_kernel_read strncpy_from_unsafe_strict and explain the different versus the non-strict version. Signed-off-by: Christoph Hellwig Signed-off-by: Andrew Morton Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Masami Hiramatsu Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20200521152301.2587579-5-hch@lst.de Signed-off-by: Linus Torvalds --- mm/maccess.c | 60 +++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 43 insertions(+), 17 deletions(-) diff --git a/mm/maccess.c b/mm/maccess.c index 3ca8d97e5010..d263c7b5e4eb 100644 --- a/mm/maccess.c +++ b/mm/maccess.c @@ -31,29 +31,36 @@ probe_write_common(void __user *dst, const void *src, size_t size) } /** - * probe_kernel_read(): safely attempt to read from a kernel-space location + * probe_kernel_read(): safely attempt to read from any location * @dst: pointer to the buffer that shall take the data * @src: address to read from * @size: size of the data chunk * - * Safely read from address @src to the buffer at @dst. If a kernel fault - * happens, handle that and return -EFAULT. + * Same as probe_kernel_read_strict() except that for architectures with + * not fully separated user and kernel address spaces this function also works + * for user address tanges. + * + * DO NOT USE THIS FUNCTION - it is broken on architectures with entirely + * separate kernel and user address spaces, and also a bad idea otherwise. + */ +long __weak probe_kernel_read(void *dst, const void *src, size_t size) + __attribute__((alias("__probe_kernel_read"))); + +/** + * probe_kernel_read_strict(): safely attempt to read from kernel-space + * @dst: pointer to the buffer that shall take the data + * @src: address to read from + * @size: size of the data chunk + * + * Safely read from kernel address @src to the buffer at @dst. If a kernel + * fault happens, handle that and return -EFAULT. * * We ensure that the copy_from_user is executed in atomic context so that * do_page_fault() doesn't attempt to take mmap_sem. This makes * probe_kernel_read() suitable for use within regions where the caller * already holds mmap_sem, or other locks which nest inside mmap_sem. - * - * probe_kernel_read_strict() is the same as probe_kernel_read() except for - * the case where architectures have non-overlapping user and kernel address - * ranges: probe_kernel_read_strict() will additionally return -EFAULT for - * probing memory on a user address range where probe_user_read() is supposed - * to be used instead. */ -long __weak probe_kernel_read(void *dst, const void *src, size_t size) - __attribute__((alias("__probe_kernel_read"))); - long __weak probe_kernel_read_strict(void *dst, const void *src, size_t size) __attribute__((alias("__probe_kernel_read"))); @@ -167,16 +174,35 @@ EXPORT_SYMBOL_GPL(probe_user_write); * If @count is smaller than the length of the string, copies @count-1 bytes, * sets the last byte of @dst buffer to NUL and returns @count. * - * strncpy_from_unsafe_strict() is the same as strncpy_from_unsafe() except - * for the case where architectures have non-overlapping user and kernel address - * ranges: strncpy_from_unsafe_strict() will additionally return -EFAULT for - * probing memory on a user address range where strncpy_from_unsafe_user() is - * supposed to be used instead. + * Same as strncpy_from_unsafe_strict() except that for architectures with + * not fully separated user and kernel address spaces this function also works + * for user address tanges. + * + * DO NOT USE THIS FUNCTION - it is broken on architectures with entirely + * separate kernel and user address spaces, and also a bad idea otherwise. */ long __weak strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count) __attribute__((alias("__strncpy_from_unsafe"))); +/** + * strncpy_from_unsafe_strict: - Copy a NUL terminated string from unsafe + * address. + * @dst: Destination address, in kernel space. This buffer must be at + * least @count bytes long. + * @unsafe_addr: Unsafe address. + * @count: Maximum number of bytes to copy, including the trailing NUL. + * + * Copies a NUL-terminated string from unsafe address to kernel buffer. + * + * On success, returns the length of the string INCLUDING the trailing NUL. + * + * If access fails, returns -EFAULT (some data may have been copied + * and the trailing NUL added). + * + * If @count is smaller than the length of the string, copies @count-1 bytes, + * sets the last byte of @dst buffer to NUL and returns @count. + */ long __weak strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr, long count) __attribute__((alias("__strncpy_from_unsafe"))); From patchwork Mon May 22 20:33:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiri Olsa X-Patchwork-Id: 13251087 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C6C87154AF for ; Mon, 22 May 2023 20:35:13 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8A771C4339B; Mon, 22 May 2023 20:35:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1684787713; bh=eHFLwkwnqJLV1skqiSQb1kO36BXLt5pZkTr7pZXoGQ8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ZIotzxhO3HwG38hcM6aijGpyjjnPs6cQsHkoiUDuxBLolfFQK75I5fpMs/cB2wf6V mR28ahuPQ/KT51PcT/869TcBM4E13dtkg3oN4VqdX5vLB1tTSzCswlUWwoW0mG/Nk9 unbDxoFcjbze+24r3T8thJEGfCNznl3ceEouvKiqvESC0RkGfcvdzjJdkqd17+YN8+ ewzuQBUBiMKT3knw5DWvLs+LUTtQ4INx5qw1R5jX86s12NeJgwmfyzJT7vPao72NWZ 9cyMZflIK3W6SdeIKtGaRSmhUVl67wYenrT+LW5Mw1ZSktfTIWARdzWVH3DENzNnMG TkYl06+GRoHFw== From: Jiri Olsa To: stable@vger.kernel.org Cc: Alexei Starovoitov , Daniel Borkmann , "H. Peter Anvin" , Ingo Molnar , Masami Hiramatsu , Thomas Gleixner , linux-mm@kvack.org, bpf@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Tsahee Zidenberg , Andrii Nakryiko , Christoph Hellwig , =?utf-8?q?Mah=C3=A9_Tardy?= , linux-arm-kernel@lists.infradead.org Subject: [RFC PATCH stable 5.4 5/8] maccess: rename strncpy_from_unsafe_user to strncpy_from_user_nofault Date: Mon, 22 May 2023 22:33:49 +0200 Message-Id: <20230522203352.738576-6-jolsa@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230522203352.738576-1-jolsa@kernel.org> References: <20230522203352.738576-1-jolsa@kernel.org> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-State: RFC From: Christoph Hellwig commit bd88bb5d4007949be7154deae7cef7173c751a95 upstream. [Missing bpf_trace.c hunk due to not backported changes] This matches the naming of strncpy_from_user, and also makes it more clear what the function is supposed to do. Signed-off-by: Christoph Hellwig Signed-off-by: Andrew Morton Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Masami Hiramatsu Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20200521152301.2587579-7-hch@lst.de Signed-off-by: Linus Torvalds --- include/linux/uaccess.h | 4 ++-- kernel/trace/trace_kprobe.c | 2 +- mm/maccess.c | 4 ++-- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h index 25ae650dcb1a..23bda5df4c08 100644 --- a/include/linux/uaccess.h +++ b/include/linux/uaccess.h @@ -358,8 +358,8 @@ extern long strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count); extern long strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr, long count); extern long __strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count); -extern long strncpy_from_unsafe_user(char *dst, const void __user *unsafe_addr, - long count); +long strncpy_from_user_nofault(char *dst, const void __user *unsafe_addr, + long count); extern long strnlen_unsafe_user(const void __user *unsafe_addr, long count); /** diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c index a422cf6a0358..d6ba4f6bed73 100644 --- a/kernel/trace/trace_kprobe.c +++ b/kernel/trace/trace_kprobe.c @@ -1111,7 +1111,7 @@ fetch_store_string_user(unsigned long addr, void *dest, void *base) __dest = get_loc_data(dest, base); - ret = strncpy_from_unsafe_user(__dest, uaddr, maxlen); + ret = strncpy_from_user_nofault(__dest, uaddr, maxlen); if (ret >= 0) *(u32 *)dest = make_data_loc(ret, __dest - base); diff --git a/mm/maccess.c b/mm/maccess.c index d263c7b5e4eb..8e4d564b6c25 100644 --- a/mm/maccess.c +++ b/mm/maccess.c @@ -231,7 +231,7 @@ long __strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count) } /** - * strncpy_from_unsafe_user: - Copy a NUL terminated string from unsafe user + * strncpy_from_user_nofault: - Copy a NUL terminated string from unsafe user * address. * @dst: Destination address, in kernel space. This buffer must be at * least @count bytes long. @@ -248,7 +248,7 @@ long __strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count) * If @count is smaller than the length of the string, copies @count-1 bytes, * sets the last byte of @dst buffer to NUL and returns @count. */ -long strncpy_from_unsafe_user(char *dst, const void __user *unsafe_addr, +long strncpy_from_user_nofault(char *dst, const void __user *unsafe_addr, long count) { mm_segment_t old_fs = get_fs(); From patchwork Mon May 22 20:33:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiri Olsa X-Patchwork-Id: 13251088 X-Patchwork-Delegate: bpf@iogearbox.net Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6EDB0154AF for ; Mon, 22 May 2023 20:35:27 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id EFB53C433D2; Mon, 22 May 2023 20:35:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1684787727; bh=WBBOfaVUfGGoU8Hmkitpa+xWNdHXkkJPQthgTtdReZ4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=j5yYDFtBdzXqePyI8lpdVEi04496pv3AyvnZk6ScW1UV2DQ8acR1fu0Zjimr1Fttz GZftk2wWppltBRYHQQ2jJ8mlmHIW3tCIDetovhFUMWy14LPamrHwm7F4KVb0+Z+Rhb 87DD2hTbfB1dXY23B8HvEaQME2y8JC3ubY8sc+/1NPUuopB62F22XQCvkD0QCFdgvs GFDgVkUM1/RK9m+johxQwFTnguhsEIQEKfVlQqyHvMDfSNmCQA4FdjjqD19qelJAyJ HKoyuzpm/zHjF/7cy1FHls7U91J2KKAj0s/6oWCbkfvkU0sP9r6+7je1kzk/P36gQb 7yh4kJdcOSfVw== From: Jiri Olsa To: stable@vger.kernel.org Cc: Alexei Starovoitov , Daniel Borkmann , "H. Peter Anvin" , Ingo Molnar , Masami Hiramatsu , Thomas Gleixner , linux-mm@kvack.org, bpf@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Tsahee Zidenberg , Andrii Nakryiko , Christoph Hellwig , =?utf-8?q?Mah=C3=A9_Tardy?= , linux-arm-kernel@lists.infradead.org Subject: [RFC PATCH stable 5.4 6/8] maccess: rename strncpy_from_unsafe_strict to strncpy_from_kernel_nofault Date: Mon, 22 May 2023 22:33:50 +0200 Message-Id: <20230522203352.738576-7-jolsa@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230522203352.738576-1-jolsa@kernel.org> References: <20230522203352.738576-1-jolsa@kernel.org> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net X-Patchwork-State: RFC From: Christoph Hellwig commit c4cb164426aebd635baa53685b0ebf1a127e9803 upstream. [Missing bpf_trace_printk due to not backported changes] This matches the naming of strncpy_from_user_nofault, and also makes it more clear what the function is supposed to do. Signed-off-by: Christoph Hellwig Signed-off-by: Andrew Morton Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Masami Hiramatsu Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20200521152301.2587579-8-hch@lst.de Signed-off-by: Linus Torvalds --- arch/x86/mm/maccess.c | 2 +- include/linux/uaccess.h | 4 ++-- kernel/trace/bpf_trace.c | 2 +- mm/maccess.c | 6 +++--- 4 files changed, 7 insertions(+), 7 deletions(-) diff --git a/arch/x86/mm/maccess.c b/arch/x86/mm/maccess.c index f5b85bdc0535..62c4017a2473 100644 --- a/arch/x86/mm/maccess.c +++ b/arch/x86/mm/maccess.c @@ -34,7 +34,7 @@ long probe_kernel_read_strict(void *dst, const void *src, size_t size) return __probe_kernel_read(dst, src, size); } -long strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr, long count) +long strncpy_from_kernel_nofault(char *dst, const void *unsafe_addr, long count) { if (unlikely(invalid_probe_range((unsigned long)unsafe_addr))) return -EFAULT; diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h index 23bda5df4c08..7a926c5b77ce 100644 --- a/include/linux/uaccess.h +++ b/include/linux/uaccess.h @@ -355,8 +355,8 @@ extern long notrace probe_user_write(void __user *dst, const void *src, size_t s extern long notrace __probe_user_write(void __user *dst, const void *src, size_t size); extern long strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count); -extern long strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr, - long count); +long strncpy_from_kernel_nofault(char *dst, const void *unsafe_addr, + long count); extern long __strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count); long strncpy_from_user_nofault(char *dst, const void __user *unsafe_addr, long count); diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 61c81c38202b..d1fd13a47bdf 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -187,7 +187,7 @@ bpf_probe_read_kernel_str_common(void *dst, u32 size, const void *unsafe_ptr, * is returned that can be used for bpf_perf_event_output() et al. */ ret = compat ? strncpy_from_unsafe(dst, unsafe_ptr, size) : - strncpy_from_unsafe_strict(dst, unsafe_ptr, size); + strncpy_from_kernel_nofault(dst, unsafe_ptr, size); if (unlikely(ret < 0)) out: memset(dst, 0, size); diff --git a/mm/maccess.c b/mm/maccess.c index 8e4d564b6c25..8cfe21dfc953 100644 --- a/mm/maccess.c +++ b/mm/maccess.c @@ -174,7 +174,7 @@ EXPORT_SYMBOL_GPL(probe_user_write); * If @count is smaller than the length of the string, copies @count-1 bytes, * sets the last byte of @dst buffer to NUL and returns @count. * - * Same as strncpy_from_unsafe_strict() except that for architectures with + * Same as strncpy_from_kernel_nofault() except that for architectures with * not fully separated user and kernel address spaces this function also works * for user address tanges. * @@ -186,7 +186,7 @@ long __weak strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count) __attribute__((alias("__strncpy_from_unsafe"))); /** - * strncpy_from_unsafe_strict: - Copy a NUL terminated string from unsafe + * strncpy_from_kernel_nofault: - Copy a NUL terminated string from unsafe * address. * @dst: Destination address, in kernel space. This buffer must be at * least @count bytes long. @@ -203,7 +203,7 @@ long __weak strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count) * If @count is smaller than the length of the string, copies @count-1 bytes, * sets the last byte of @dst buffer to NUL and returns @count. */ -long __weak strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr, +long __weak strncpy_from_kernel_nofault(char *dst, const void *unsafe_addr, long count) __attribute__((alias("__strncpy_from_unsafe"))); From patchwork Mon May 22 20:33:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiri Olsa X-Patchwork-Id: 13251089 X-Patchwork-Delegate: bpf@iogearbox.net Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6C398171D7 for ; Mon, 22 May 2023 20:35:40 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 05A3CC4339B; Mon, 22 May 2023 20:35:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1684787740; bh=hPFjOzuSnoN/PTZCIZOnbk80dddEgF6LEUhE4eMCblA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=W5XbiWtuhMenbbX2eUsoinsN+8nRR/kk7KRzLM8u5n0ct6UJ6TQgEMjP7g7qlnGGs 2OVjjAHxKwg5pjH9dqMzfV222zpLYZ/4BpTR7v7ZCSUSb9jtV8BcjfGrC0a/Y7Nhe8 72PtsNoXrMmSGA5Rtsqncf24gdJ8qlDxXHN3WtRUJKeMt7zKVZkNWBC+ElyUbnBE3r 9+8EDRyGXj7XaXJTyEfdXze96jJhrwyIlAESgXyTKCXz+LfKhh7NDX+ciCPw2ypc2E usyZxZNSzHp7HWwnbbdBDf9vxIu5tWnhCC7dSJYUjPRNH7Uy7lq+/9b+s74HUGmrBx KVs56ZQh4lioA== From: Jiri Olsa To: stable@vger.kernel.org Cc: Alexei Starovoitov , Daniel Borkmann , "H. Peter Anvin" , Ingo Molnar , Masami Hiramatsu , Thomas Gleixner , linux-mm@kvack.org, bpf@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Tsahee Zidenberg , Andrii Nakryiko , Christoph Hellwig , =?utf-8?q?Mah=C3=A9_Tardy?= , linux-arm-kernel@lists.infradead.org Subject: [RFC PATCH stable 5.4 7/8] bpf: rework the compat kernel probe handling Date: Mon, 22 May 2023 22:33:51 +0200 Message-Id: <20230522203352.738576-8-jolsa@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230522203352.738576-1-jolsa@kernel.org> References: <20230522203352.738576-1-jolsa@kernel.org> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net X-Patchwork-State: RFC From: Christoph Hellwig commit 8d92db5c04d10381f4db70ed99b1b576f5db18a7 upstream. [Conflicts due to applying hunks only to the functions that were taken in 6ae08ae3dea2 upstream commit backport earlier] Instead of using the dangerous probe_kernel_read and strncpy_from_unsafe helpers, rework the compat probes to check if an address is a kernel or userspace one, and then use the low-level kernel or user probe helper shared by the proper kernel and user probe helpers. This slightly changes behavior as the compat probe on a user address doesn't check the lockdown flags, just as the pure user probes do. Signed-off-by: Christoph Hellwig Signed-off-by: Andrew Morton Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Masami Hiramatsu Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20200521152301.2587579-14-hch@lst.de Signed-off-by: Linus Torvalds --- kernel/trace/bpf_trace.c | 93 +++++++++++++++++++++++++++------------- 1 file changed, 64 insertions(+), 29 deletions(-) diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index d1fd13a47bdf..a46256f99229 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -139,65 +139,99 @@ static const struct bpf_func_proto bpf_override_return_proto = { #endif static __always_inline int -bpf_probe_read_kernel_common(void *dst, u32 size, const void *unsafe_ptr, - const bool compat) +bpf_probe_read_user_common(void *dst, u32 size, const void __user *unsafe_ptr) { - int ret = security_locked_down(LOCKDOWN_BPF_READ); + int ret; + ret = probe_user_read(dst, unsafe_ptr, size); if (unlikely(ret < 0)) - goto out; - ret = compat ? probe_kernel_read(dst, unsafe_ptr, size) : - probe_kernel_read_strict(dst, unsafe_ptr, size); - if (unlikely(ret < 0)) -out: memset(dst, 0, size); return ret; } -BPF_CALL_3(bpf_probe_read_compat, void *, dst, u32, size, - const void *, unsafe_ptr) +static __always_inline int +bpf_probe_read_user_str_common(void *dst, u32 size, + const void __user *unsafe_ptr) { - return bpf_probe_read_kernel_common(dst, size, unsafe_ptr, true); + int ret; + + ret = strncpy_from_user_nofault(dst, unsafe_ptr, size); + if (unlikely(ret < 0)) + memset(dst, 0, size); + return ret; } -static const struct bpf_func_proto bpf_probe_read_compat_proto = { - .func = bpf_probe_read_compat, - .gpl_only = true, - .ret_type = RET_INTEGER, - .arg1_type = ARG_PTR_TO_UNINIT_MEM, - .arg2_type = ARG_CONST_SIZE_OR_ZERO, - .arg3_type = ARG_ANYTHING, -}; +static __always_inline int +bpf_probe_read_kernel_common(void *dst, u32 size, const void *unsafe_ptr) +{ + int ret = security_locked_down(LOCKDOWN_BPF_READ); + + if (unlikely(ret < 0)) + goto fail; + ret = probe_kernel_read_strict(dst, unsafe_ptr, size); + if (unlikely(ret < 0)) + goto fail; + return ret; +fail: + memset(dst, 0, size); + return ret; +} static __always_inline int -bpf_probe_read_kernel_str_common(void *dst, u32 size, const void *unsafe_ptr, - const bool compat) +bpf_probe_read_kernel_str_common(void *dst, u32 size, const void *unsafe_ptr) { int ret = security_locked_down(LOCKDOWN_BPF_READ); if (unlikely(ret < 0)) - goto out; + goto fail; + /* - * The strncpy_from_unsafe_*() call will likely not fill the entire - * buffer, but that's okay in this circumstance as we're probing + * The strncpy_from_kernel_nofault() call will likely not fill the + * entire buffer, but that's okay in this circumstance as we're probing * arbitrary memory anyway similar to bpf_probe_read_*() and might * as well probe the stack. Thus, memory is explicitly cleared * only in error case, so that improper users ignoring return * code altogether don't copy garbage; otherwise length of string * is returned that can be used for bpf_perf_event_output() et al. */ - ret = compat ? strncpy_from_unsafe(dst, unsafe_ptr, size) : - strncpy_from_kernel_nofault(dst, unsafe_ptr, size); + ret = strncpy_from_kernel_nofault(dst, unsafe_ptr, size); if (unlikely(ret < 0)) -out: - memset(dst, 0, size); + goto fail; + + return 0; +fail: + memset(dst, 0, size); return ret; } +#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE +BPF_CALL_3(bpf_probe_read_compat, void *, dst, u32, size, + const void *, unsafe_ptr) +{ + if ((unsigned long)unsafe_ptr < TASK_SIZE) { + return bpf_probe_read_user_common(dst, size, + (__force void __user *)unsafe_ptr); + } + return bpf_probe_read_kernel_common(dst, size, unsafe_ptr); +} + +static const struct bpf_func_proto bpf_probe_read_compat_proto = { + .func = bpf_probe_read_compat, + .gpl_only = true, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_UNINIT_MEM, + .arg2_type = ARG_CONST_SIZE_OR_ZERO, + .arg3_type = ARG_ANYTHING, +}; + BPF_CALL_3(bpf_probe_read_compat_str, void *, dst, u32, size, const void *, unsafe_ptr) { - return bpf_probe_read_kernel_str_common(dst, size, unsafe_ptr, true); + if ((unsigned long)unsafe_ptr < TASK_SIZE) { + return bpf_probe_read_user_str_common(dst, size, + (__force void __user *)unsafe_ptr); + } + return bpf_probe_read_kernel_str_common(dst, size, unsafe_ptr); } static const struct bpf_func_proto bpf_probe_read_compat_str_proto = { @@ -208,6 +242,7 @@ static const struct bpf_func_proto bpf_probe_read_compat_str_proto = { .arg2_type = ARG_CONST_SIZE_OR_ZERO, .arg3_type = ARG_ANYTHING, }; +#endif /* CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE */ BPF_CALL_3(bpf_probe_write_user, void __user *, unsafe_ptr, const void *, src, u32, size) From patchwork Mon May 22 20:33:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiri Olsa X-Patchwork-Id: 13251090 X-Patchwork-Delegate: bpf@iogearbox.net Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D91FC168CC for ; Mon, 22 May 2023 20:35:52 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7AD92C433D2; Mon, 22 May 2023 20:35:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1684787752; bh=0nQ8xIPSPhBIr1gc3+npYhNke4IXHy+5EM39FqGHhHM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=RvIIl7JwcJq1f1lT50bQTaPuLv0NZy78hJqb4UTvWapPNbd41FW80wvaW4pH0aAOo 4TFqg8491i9Zl3rQRFtYXMLVrUEjAIN2Sus2yH0U8HOELjOAhTmZZlj6/RVSV3KFvt Xe9iqVfgcL67vJ4LphGTNfl2Ao1Ub0EQ7bS13b2S6nILe6dKwQjN4s3A5ifjdgD6B8 HKSBErLqGZ2KGI5RO4zGnfoSCGGwg236gdTcm9UFy3mIkKjxjyKFjwyOA5VTnN1Ir6 STLK7od7jNeP4m7FYg/VWUGu+uTiQw4V/Ub1n/8uiGlC7+KimKstqJvqYoDpmxE5EM cl8yPMWAoz3Ag== From: Jiri Olsa To: stable@vger.kernel.org Cc: Christoph Hellwig , John Fastabend , linux-mm@kvack.org, bpf@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Masami Hiramatsu , Tsahee Zidenberg , Andrii Nakryiko , Daniel Borkmann , Thomas Gleixner , =?utf-8?q?Mah=C3=A9_Tardy?= , linux-arm-kernel@lists.infradead.org Subject: [RFC PATCH stable 5.4 8/8] bpf: bpf_probe_read_kernel_str() has to return amount of data read on success Date: Mon, 22 May 2023 22:33:52 +0200 Message-Id: <20230522203352.738576-9-jolsa@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230522203352.738576-1-jolsa@kernel.org> References: <20230522203352.738576-1-jolsa@kernel.org> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net X-Patchwork-State: RFC From: Andrii Nakryiko commit 02553b91da5deb63c8562b47529b09b734659af0 upstream. During recent refactorings, bpf_probe_read_kernel_str() started returning 0 on success, instead of amount of data successfully read. This majorly breaks applications relying on bpf_probe_read_kernel_str() and bpf_probe_read_str() and their results. Fix this by returning actual number of bytes read. Fixes: 8d92db5c04d1 ("bpf: rework the compat kernel probe handling") Signed-off-by: Andrii Nakryiko Signed-off-by: Daniel Borkmann Reviewed-by: Christoph Hellwig Acked-by: John Fastabend Link: https://lore.kernel.org/bpf/20200616050432.1902042-1-andriin@fb.com --- kernel/trace/bpf_trace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index a46256f99229..c4c825dcdef8 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -198,7 +198,7 @@ bpf_probe_read_kernel_str_common(void *dst, u32 size, const void *unsafe_ptr) if (unlikely(ret < 0)) goto fail; - return 0; + return ret; fail: memset(dst, 0, size); return ret;