From patchwork Fri Mar 28 19:26:31 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nathan Chancellor
X-Patchwork-Id: 14032375
From: Nathan Chancellor
Date: Fri, 28 Mar 2025 12:26:31 -0700
Subject: [PATCH v3 1/2] include: Move typedefs in nls.h to their own header
X-Mailing-List: linux-hardening@vger.kernel.org
Message-Id: <20250328-string-add-wcslen-for-llvm-opt-v3-1-a180b4c0c1c4@kernel.org>
References: <20250328-string-add-wcslen-for-llvm-opt-v3-0-a180b4c0c1c4@kernel.org>
In-Reply-To: <20250328-string-add-wcslen-for-llvm-opt-v3-0-a180b4c0c1c4@kernel.org>
To: Kees Cook
Cc: Andy Shevchenko, Nick Desaulniers, Bill Wendling, Justin Stitt,
 linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org,
 llvm@lists.linux.dev, stable@vger.kernel.org, Nathan Chancellor
X-Mailer: b4 0.15-dev

In order to allow commonly included headers such as string.h to access
typedefs such as wchar_t without running into issues with the rest of
the NLS library, refactor the typedefs out into their own header that
can be included in a much safer manner.

Cc: stable@vger.kernel.org
Reviewed-by: Andy Shevchenko
Signed-off-by: Nathan Chancellor
---
 include/linux/nls.h       | 19 +------------------
 include/linux/nls_types.h | 26 ++++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/include/linux/nls.h b/include/linux/nls.h
index e0bf8367b274..3d416d1f60b6 100644
--- a/include/linux/nls.h
+++ b/include/linux/nls.h
@@ -3,24 +3,7 @@
 #define _LINUX_NLS_H
 
 #include
-
-/* Unicode has changed over the years. Unicode code points no longer
- * fit into 16 bits; as of Unicode 5 valid code points range from 0
- * to 0x10ffff (17 planes, where each plane holds 65536 code points).
- *
- * The original decision to represent Unicode characters as 16-bit
- * wchar_t values is now outdated. But plane 0 still includes the
- * most commonly used characters, so we will retain it. The newer
- * 32-bit unicode_t type can be used when it is necessary to
- * represent the full Unicode character set.
- */
-
-/* Plane-0 Unicode character */
-typedef u16 wchar_t;
-#define MAX_WCHAR_T 0xffff
-
-/* Arbitrary Unicode character */
-typedef u32 unicode_t;
+#include
 
 struct nls_table {
 	const char *charset;
diff --git a/include/linux/nls_types.h b/include/linux/nls_types.h
new file mode 100644
index 000000000000..9479df1016da
--- /dev/null
+++ b/include/linux/nls_types.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_NLS_TYPES_H
+#define _LINUX_NLS_TYPES_H
+
+#include
+
+/*
+ * Unicode has changed over the years. Unicode code points no longer
+ * fit into 16 bits; as of Unicode 5 valid code points range from 0
+ * to 0x10ffff (17 planes, where each plane holds 65536 code points).
+ *
+ * The original decision to represent Unicode characters as 16-bit
+ * wchar_t values is now outdated. But plane 0 still includes the
+ * most commonly used characters, so we will retain it. The newer
+ * 32-bit unicode_t type can be used when it is necessary to
+ * represent the full Unicode character set.
+ */
+
+/* Plane-0 Unicode character */
+typedef u16 wchar_t;
+#define MAX_WCHAR_T 0xffff
+
+/* Arbitrary Unicode character */
+typedef u32 unicode_t;
+
+#endif /* _LINUX_NLS_TYPES_H */

From patchwork Fri Mar 28 19:26:32 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nathan Chancellor
X-Patchwork-Id: 14032376
From: Nathan Chancellor
Date: Fri, 28 Mar 2025 12:26:32 -0700
Subject: [PATCH v3 2/2] lib/string.c: Add wcslen()
X-Mailing-List: linux-hardening@vger.kernel.org
Message-Id: <20250328-string-add-wcslen-for-llvm-opt-v3-2-a180b4c0c1c4@kernel.org>
References: <20250328-string-add-wcslen-for-llvm-opt-v3-0-a180b4c0c1c4@kernel.org>
In-Reply-To: <20250328-string-add-wcslen-for-llvm-opt-v3-0-a180b4c0c1c4@kernel.org>
To: Kees Cook
Cc: Andy Shevchenko, Nick Desaulniers, Bill Wendling, Justin Stitt,
 linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org,
 llvm@lists.linux.dev, stable@vger.kernel.org, Nathan Chancellor
X-Mailer: b4 0.15-dev

A recent optimization change in LLVM [1] aims to transform certain loop
idioms into calls to strlen() or wcslen().
This change transforms the first while loop in UniStrcat() into a call to
wcslen(), breaking the build when UniStrcat() gets inlined into
alloc_path_with_tree_prefix():

  ld.lld: error: undefined symbol: wcslen
  >>> referenced by nls_ucs2_utils.h:54 (fs/smb/client/../../nls/nls_ucs2_utils.h:54)
  >>>               vmlinux.o:(alloc_path_with_tree_prefix)
  >>> referenced by nls_ucs2_utils.h:54 (fs/smb/client/../../nls/nls_ucs2_utils.h:54)
  >>>               vmlinux.o:(alloc_path_with_tree_prefix)

The kernel does not build with '-ffreestanding' (which would avoid this
transformation) because it does want libcall optimizations in general
and turning on '-ffreestanding' disables the majority of them. While
'-fno-builtin-wcslen' would be more targeted at the problem, it does not
work with LTO.

Add a basic wcslen() to avoid this linkage failure. While no
architecture or FORTIFY_SOURCE overrides this, add it to string.c
instead of string_helpers.c so that it is built with '-ffreestanding',
otherwise the compiler might transform it into a call to itself.

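For illustration only, a minimal userspace sketch of the loop shape
discussed above. The names ucs2_char, ucs2_len() and ucs2_cat() are
hypothetical stand-ins and are not the kernel's UniStrcat(). The first
loop in ucs2_cat() only walks to the terminator, which is the kind of
idiom a compiler may rewrite as a length libcall; ucs2_len() mirrors the
shape of the wcslen() this patch adds.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's 16-bit wchar_t. */
typedef uint16_t ucs2_char;

/* Count the 16-bit units before the terminating zero. */
static size_t ucs2_len(const ucs2_char *s)
{
	const ucs2_char *sc;

	for (sc = s; *sc != 0; ++sc)
		/* nothing */;
	return (size_t)(sc - s);
}

/*
 * Hypothetical concatenation helper. The first loop is a pure
 * "scan to the terminator" idiom; the second copies src, including
 * its terminator, onto the end of dst.
 */
static ucs2_char *ucs2_cat(ucs2_char *dst, const ucs2_char *src)
{
	ucs2_char *end = dst;

	while (*end)
		end++;
	while ((*end++ = *src++) != 0)
		/* copy */;
	return dst;
}
```

If the compiler replaces the first loop with a call to a length
function, that symbol has to exist at link time, which is the failure
the linker output above reports.
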
Cc: stable@vger.kernel.org
Link: https://github.com/llvm/llvm-project/commit/9694844d7e36fd5e01011ab56b64f27b867aa72d [1]
Signed-off-by: Nathan Chancellor
---
 include/linux/string.h |  2 ++
 lib/string.c           | 11 +++++++++++
 2 files changed, 13 insertions(+)

diff --git a/include/linux/string.h b/include/linux/string.h
index 0403a4ca4c11..b000f445a2c7 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -10,6 +10,7 @@
 #include /* for NULL */
 #include /* for ERR_PTR() */
 #include /* for E2BIG */
+#include /* for wchar_t */
 #include /* for check_mul_overflow() */
 #include
 #include
@@ -203,6 +204,7 @@ extern __kernel_size_t strlen(const char *);
 #ifndef __HAVE_ARCH_STRNLEN
 extern __kernel_size_t strnlen(const char *,__kernel_size_t);
 #endif
+__kernel_size_t wcslen(const wchar_t *s);
 #ifndef __HAVE_ARCH_STRPBRK
 extern char * strpbrk(const char *,const char *);
 #endif
diff --git a/lib/string.c b/lib/string.c
index eb4486ed40d2..2c6f8c8f4159 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -21,6 +21,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -429,6 +430,16 @@ size_t strnlen(const char *s, size_t count)
 EXPORT_SYMBOL(strnlen);
 #endif
 
+size_t wcslen(const wchar_t *s)
+{
+	const wchar_t *sc;
+
+	for (sc = s; *sc != '\0'; ++sc)
+		/* nothing */;
+	return sc - s;
+}
+EXPORT_SYMBOL(wcslen);
+
 #ifndef __HAVE_ARCH_STRSPN
 /**
  * strspn - Calculate the length of the initial substring of @s which only contain letters in @accept