diff mbox series

[RFC] xfrm: work around a clang-19 fortifiy-string false-positive

Message ID 20240216202657.2493685-1-arnd@kernel.org (mailing list archive)
State RFC
Delegated to: Netdev Maintainers
Headers show
Series [RFC] xfrm: work around a clang-19 fortifiy-string false-positive | expand

Checks

Context Check Description
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 5224 this patch: 5224
netdev/build_tools success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers success CCed 11 of 11 maintainers
netdev/build_clang success Errors and warnings before: 1085 this patch: 1085
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 5531 this patch: 5531
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 35 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-02-17--03-00 (tests: 1445)

Commit Message

Arnd Bergmann Feb. 16, 2024, 8:26 p.m. UTC
From: Arnd Bergmann <arnd@arndb.de>

clang-19 recently got branched from clang-18 and is not yet released.
The current version introduces exactly one new warning that I came
across in randconfig testing, in the copy_to_user_tmpl() function:

include/linux/fortify-string.h:420:4: error: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror,-Wattribute-warning]
  420 |                         __write_overflow_field(p_size_field, size);

I have not yet produced a minimized test case for it, but I have a
local workaround, which avoids the memset() here by replacing it with
an initializer.

The memset is required to avoid leaking stack data to user space
and was added in commit 1f86840f8977 ("xfrm_user: fix info leak in
copy_to_user_tmpl()"). Simply changing the initializer to set all fields
still risks leaking data in the padding between them, which the compiler
is free to do here. To work around that problem, explicit padding fields
have to get added as well.

My first idea was that just adding the padding would avoid the warning
as well, as the padding tends to confused the fortified string helpers,
but it turns out that both changes are required here.

Since this is a false positive, a better fix would likely be to
fix the compiler.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 include/uapi/linux/xfrm.h | 3 +++
 net/xfrm/xfrm_user.c      | 3 +--
 2 files changed, 4 insertions(+), 2 deletions(-)

Comments

Nathan Chancellor Feb. 16, 2024, 8:42 p.m. UTC | #1
Hi Arnd,

On Fri, Feb 16, 2024 at 09:26:40PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> clang-19 recently got branched from clang-18 and is not yet released.
> The current version introduces exactly one new warning that I came
> across in randconfig testing, in the copy_to_user_tmpl() function:
> 
> include/linux/fortify-string.h:420:4: error: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror,-Wattribute-warning]
>   420 |                         __write_overflow_field(p_size_field, size);
> 
> I have not yet produced a minimized test case for it, but I have a
> local workaround, which avoids the memset() here by replacing it with
> an initializer.
> 
> The memset is required to avoid leaking stack data to user space
> and was added in commit 1f86840f8977 ("xfrm_user: fix info leak in
> copy_to_user_tmpl()"). Simply changing the initializer to set all fields
> still risks leaking data in the padding between them, which the compiler
> is free to do here. To work around that problem, explicit padding fields
> have to get added as well.
> 
> My first idea was that just adding the padding would avoid the warning
> as well, as the padding tends to confused the fortified string helpers,
> but it turns out that both changes are required here.
> 
> Since this is a false positive, a better fix would likely be to
> fix the compiler.

I have some observations and notes from my initial investigation into
this issue on our GitHub issue tracker but I have not produced a
minimized test case either.

https://github.com/ClangBuiltLinux/linux/issues/1985

> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  include/uapi/linux/xfrm.h | 3 +++
>  net/xfrm/xfrm_user.c      | 3 +--
>  2 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/include/uapi/linux/xfrm.h b/include/uapi/linux/xfrm.h
> index 6a77328be114..99adac4fa648 100644
> --- a/include/uapi/linux/xfrm.h
> +++ b/include/uapi/linux/xfrm.h
> @@ -27,6 +27,7 @@ struct xfrm_id {
>  	xfrm_address_t	daddr;
>  	__be32		spi;
>  	__u8		proto;
> +	__u8		__pad[3];
>  };
>  
>  struct xfrm_sec_ctx {
> @@ -242,11 +243,13 @@ struct xfrm_user_sec_ctx {
>  struct xfrm_user_tmpl {
>  	struct xfrm_id		id;
>  	__u16			family;
> +	__u16			__pad1;
>  	xfrm_address_t		saddr;
>  	__u32			reqid;
>  	__u8			mode;
>  	__u8			share;
>  	__u8			optional;
> +	__u8			__pad2;
>  	__u32			aalgos;
>  	__u32			ealgos;
>  	__u32			calgos;
> diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
> index a5232dcfea46..e81f977e183c 100644
> --- a/net/xfrm/xfrm_user.c
> +++ b/net/xfrm/xfrm_user.c
> @@ -2011,7 +2011,7 @@ static int xfrm_add_policy(struct sk_buff *skb, struct nlmsghdr *nlh,
>  
>  static int copy_to_user_tmpl(struct xfrm_policy *xp, struct sk_buff *skb)
>  {
> -	struct xfrm_user_tmpl vec[XFRM_MAX_DEPTH];
> +	struct xfrm_user_tmpl vec[XFRM_MAX_DEPTH] = {};
>  	int i;
>  
>  	if (xp->xfrm_nr == 0)
> @@ -2021,7 +2021,6 @@ static int copy_to_user_tmpl(struct xfrm_policy *xp, struct sk_buff *skb)
>  		struct xfrm_user_tmpl *up = &vec[i];
>  		struct xfrm_tmpl *kp = &xp->xfrm_vec[i];
>  
> -		memset(up, 0, sizeof(*up));
>  		memcpy(&up->id, &kp->id, sizeof(up->id));
>  		up->family = kp->encap_family;
>  		memcpy(&up->saddr, &kp->saddr, sizeof(up->saddr));
> -- 
> 2.39.2
>
Kees Cook Feb. 16, 2024, 9:19 p.m. UTC | #2
On Fri, Feb 16, 2024 at 09:26:40PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> clang-19 recently got branched from clang-18 and is not yet released.
> The current version introduces exactly one new warning that I came
> across in randconfig testing, in the copy_to_user_tmpl() function:
> 
> include/linux/fortify-string.h:420:4: error: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror,-Wattribute-warning]
>   420 |                         __write_overflow_field(p_size_field, size);
> 
> I have not yet produced a minimized test case for it, but I have a
> local workaround, which avoids the memset() here by replacing it with
> an initializer.
> 
> The memset is required to avoid leaking stack data to user space
> and was added in commit 1f86840f8977 ("xfrm_user: fix info leak in
> copy_to_user_tmpl()"). Simply changing the initializer to set all fields
> still risks leaking data in the padding between them, which the compiler
> is free to do here. To work around that problem, explicit padding fields
> have to get added as well.

Per C11, padding bits are zero initialized if there is an initializer,
so "= { }" here should be sufficient -- no need to add the struct
members.

> Since this is a false positive, a better fix would likely be to
> fix the compiler.

As Nathan has found, this appears to be a loop unrolling bug in Clang.
https://github.com/ClangBuiltLinux/linux/issues/1985

The shorter fix (in the issue) is to explicitly range-check before
the loop:

       if (xp->xfrm_nr > XFRM_MAX_DEPTH)
               return -ENOBUFS;
Arnd Bergmann April 8, 2024, 7:06 a.m. UTC | #3
On Fri, Feb 16, 2024, at 22:19, Kees Cook wrote:
> On Fri, Feb 16, 2024 at 09:26:40PM +0100, Arnd Bergmann wrote:
>> From: Arnd Bergmann <arnd@arndb.de>
>> 
>> clang-19 recently got branched from clang-18 and is not yet released.
>> The current version introduces exactly one new warning that I came
>> across in randconfig testing, in the copy_to_user_tmpl() function:
>> 
>> include/linux/fortify-string.h:420:4: error: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror,-Wattribute-warning]
>>   420 |                         __write_overflow_field(p_size_field, size);
>> 
>> I have not yet produced a minimized test case for it, but I have a
>> local workaround, which avoids the memset() here by replacing it with
>> an initializer.
>> 
>> The memset is required to avoid leaking stack data to user space
>> and was added in commit 1f86840f8977 ("xfrm_user: fix info leak in
>> copy_to_user_tmpl()"). Simply changing the initializer to set all fields
>> still risks leaking data in the padding between them, which the compiler
>> is free to do here. To work around that problem, explicit padding fields
>> have to get added as well.
>
> Per C11, padding bits are zero initialized if there is an initializer,
> so "= { }" here should be sufficient -- no need to add the struct
> members.
>
>> Since this is a false positive, a better fix would likely be to
>> fix the compiler.
>
> As Nathan has found, this appears to be a loop unrolling bug in Clang.
> https://github.com/ClangBuiltLinux/linux/issues/1985
>
> The shorter fix (in the issue) is to explicitly range-check before
> the loop:
>
>        if (xp->xfrm_nr > XFRM_MAX_DEPTH)
>                return -ENOBUFS;

I ran into this issue again and I see that Nathan's fix has
made it into mainline and backports, but it's apparently
not sufficient.

I don't see the warning with my patch from this thread, but
there may still be a better fix.

      Arnd
Nathan Chancellor April 9, 2024, 4:15 p.m. UTC | #4
On Mon, Apr 08, 2024 at 09:06:21AM +0200, Arnd Bergmann wrote:
> On Fri, Feb 16, 2024, at 22:19, Kees Cook wrote:
> > On Fri, Feb 16, 2024 at 09:26:40PM +0100, Arnd Bergmann wrote:
> >> From: Arnd Bergmann <arnd@arndb.de>
> >> 
> >> clang-19 recently got branched from clang-18 and is not yet released.
> >> The current version introduces exactly one new warning that I came
> >> across in randconfig testing, in the copy_to_user_tmpl() function:
> >> 
> >> include/linux/fortify-string.h:420:4: error: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror,-Wattribute-warning]
> >>   420 |                         __write_overflow_field(p_size_field, size);
> >> 
> >> I have not yet produced a minimized test case for it, but I have a
> >> local workaround, which avoids the memset() here by replacing it with
> >> an initializer.
> >> 
> >> The memset is required to avoid leaking stack data to user space
> >> and was added in commit 1f86840f8977 ("xfrm_user: fix info leak in
> >> copy_to_user_tmpl()"). Simply changing the initializer to set all fields
> >> still risks leaking data in the padding between them, which the compiler
> >> is free to do here. To work around that problem, explicit padding fields
> >> have to get added as well.
> >
> > Per C11, padding bits are zero initialized if there is an initializer,
> > so "= { }" here should be sufficient -- no need to add the struct
> > members.
> >
> >> Since this is a false positive, a better fix would likely be to
> >> fix the compiler.
> >
> > As Nathan has found, this appears to be a loop unrolling bug in Clang.
> > https://github.com/ClangBuiltLinux/linux/issues/1985
> >
> > The shorter fix (in the issue) is to explicitly range-check before
> > the loop:
> >
> >        if (xp->xfrm_nr > XFRM_MAX_DEPTH)
> >                return -ENOBUFS;
> 
> I ran into this issue again and I see that Nathan's fix has
> made it into mainline and backports, but it's apparently
> not sufficient.
> 
> I don't see the warning with my patch from this thread, but
> there may still be a better fix.

Is it the exact same warning? clang-19 or older? What
architecture/configuration? If my change is not sufficient then maybe
there are two separate issues here? I have not seen this warning appear
in our CI since my change was applied.

Cheers,
Nathan
Arnd Bergmann April 9, 2024, 7:41 p.m. UTC | #5
On Tue, Apr 9, 2024, at 18:15, Nathan Chancellor wrote:
> On Mon, Apr 08, 2024 at 09:06:21AM +0200, Arnd Bergmann wrote:
>> >
>> > The shorter fix (in the issue) is to explicitly range-check before
>> > the loop:
>> >
>> >        if (xp->xfrm_nr > XFRM_MAX_DEPTH)
>> >                return -ENOBUFS;
>> 
>> I ran into this issue again and I see that Nathan's fix has
>> made it into mainline and backports, but it's apparently
>> not sufficient.
>> 
>> I don't see the warning with my patch from this thread, but
>> there may still be a better fix.
>
> Is it the exact same warning? clang-19 or older?
> What > architecture/configuration? If my change is not sufficient then maybe
> there are two separate issues here? I have not seen this warning appear
> in our CI since my change was applied.

I only see it with clang-19. I've never seen it with arm32 and
currently only see it with arm64, though I had seen it with x86-64
as well in February before your patch.

The warning is the same as before aside from the line number,
which which is now include/linux/fortify-string.h:462:4
where it was line 420, but I think that is just a context
change.

I have a number of configs that reproduce this bug, see
https://pastebin.com/tMgfD7cu for an example with current
linux-next.

     Arnd
Nathan Chancellor April 10, 2024, 5:45 p.m. UTC | #6
On Tue, Apr 09, 2024 at 09:41:09PM +0200, Arnd Bergmann wrote:
> On Tue, Apr 9, 2024, at 18:15, Nathan Chancellor wrote:
> > On Mon, Apr 08, 2024 at 09:06:21AM +0200, Arnd Bergmann wrote:
> >> >
> >> > The shorter fix (in the issue) is to explicitly range-check before
> >> > the loop:
> >> >
> >> >        if (xp->xfrm_nr > XFRM_MAX_DEPTH)
> >> >                return -ENOBUFS;
> >> 
> >> I ran into this issue again and I see that Nathan's fix has
> >> made it into mainline and backports, but it's apparently
> >> not sufficient.
> >> 
> >> I don't see the warning with my patch from this thread, but
> >> there may still be a better fix.
> >
> > Is it the exact same warning? clang-19 or older?
> > What > architecture/configuration? If my change is not sufficient then maybe
> > there are two separate issues here? I have not seen this warning appear
> > in our CI since my change was applied.
> 
> I only see it with clang-19. I've never seen it with arm32 and
> currently only see it with arm64, though I had seen it with x86-64
> as well in February before your patch.

That seems to line up with my prior experience.

> The warning is the same as before aside from the line number,
> which which is now include/linux/fortify-string.h:462:4
> where it was line 420, but I think that is just a context
> change.
> 
> I have a number of configs that reproduce this bug, see
> https://pastebin.com/tMgfD7cu for an example with current
> linux-next.

Thanks for that. I was able to reproduce this on next-20240410 as well
and I reduced the necessary configurations needed to reproduce this on
top of just defconfig:

  $ echo 'CONFIG_FORTIFY_SOURCE=y
  CONFIG_KASAN=y
  CONFIG_XFRM_USER=y' >arch/arm64/configs/repro.config

  $ make -skj"$(nproc)" ARCH=arm64 LLVM=1 {def,repro.}config net/xfrm/xfrm_user.o
  In file included from net/xfrm/xfrm_user.c:14:
  In file included from include/linux/compat.h:10:
  In file included from include/linux/time.h:60:
  In file included from include/linux/time32.h:13:
  In file included from include/linux/timex.h:67:
  In file included from arch/arm64/include/asm/timex.h:8:
  In file included from arch/arm64/include/asm/arch_timer.h:12:
  In file included from arch/arm64/include/asm/hwcap.h:9:
  In file included from arch/arm64/include/asm/cpufeature.h:27:
  In file included from include/linux/cpumask.h:13:
  In file included from include/linux/bitmap.h:13:
  In file included from include/linux/string.h:371:
  include/linux/fortify-string.h:462:4: warning: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
    462 |                         __write_overflow_field(p_size_field, size);
        |                         ^
  1 warning generated.

Unfortunately, I have no idea why it is complaining nor why your patch
resolves it but the combination of FORTIFY_SOURCE and KASAN certainly
seems like a reasonable place to start looking. I will see if I can come
up with a smaller reproducer to see if it becomes more obvious why this
code triggers this warning.

Cheers,
Nathan
Arnd Bergmann April 11, 2024, 11:35 a.m. UTC | #7
On Wed, Apr 10, 2024, at 19:45, Nathan Chancellor wrote:

> Unfortunately, I have no idea why it is complaining nor why your patch
> resolves it but the combination of FORTIFY_SOURCE and KASAN certainly
> seems like a reasonable place to start looking. I will see if I can come
> up with a smaller reproducer to see if it becomes more obvious why this
> code triggers this warning.

I know at least why my patch avoids the warning -- it removes the
call to memset() that contains the check. Unfortunately that still
doesn't explain what caused it.

      Arnd
Nathan Chancellor April 12, 2024, 9:21 p.m. UTC | #8
On Thu, Apr 11, 2024 at 01:35:05PM +0200, Arnd Bergmann wrote:
> On Wed, Apr 10, 2024, at 19:45, Nathan Chancellor wrote:
> 
> > Unfortunately, I have no idea why it is complaining nor why your patch
> > resolves it but the combination of FORTIFY_SOURCE and KASAN certainly
> > seems like a reasonable place to start looking. I will see if I can come
> > up with a smaller reproducer to see if it becomes more obvious why this
> > code triggers this warning.
> 
> I know at least why my patch avoids the warning -- it removes the
> call to memset() that contains the check.

Yeah duh... :/ I should have realized that before I sent that message
heh.

> Unfortunately that still doesn't explain what caused it.

Right, I'll see if I can cvise something out now that we have a more
isolated set of conditions. I guess the only question will be if I can
build a file preprocessed with CONFIG_KASAN=y will build with and
without '-fsanitize=kernel-address'...

Cheers,
Nathan
diff mbox series

Patch

diff --git a/include/uapi/linux/xfrm.h b/include/uapi/linux/xfrm.h
index 6a77328be114..99adac4fa648 100644
--- a/include/uapi/linux/xfrm.h
+++ b/include/uapi/linux/xfrm.h
@@ -27,6 +27,7 @@  struct xfrm_id {
 	xfrm_address_t	daddr;
 	__be32		spi;
 	__u8		proto;
+	__u8		__pad[3];
 };
 
 struct xfrm_sec_ctx {
@@ -242,11 +243,13 @@  struct xfrm_user_sec_ctx {
 struct xfrm_user_tmpl {
 	struct xfrm_id		id;
 	__u16			family;
+	__u16			__pad1;
 	xfrm_address_t		saddr;
 	__u32			reqid;
 	__u8			mode;
 	__u8			share;
 	__u8			optional;
+	__u8			__pad2;
 	__u32			aalgos;
 	__u32			ealgos;
 	__u32			calgos;
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index a5232dcfea46..e81f977e183c 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -2011,7 +2011,7 @@  static int xfrm_add_policy(struct sk_buff *skb, struct nlmsghdr *nlh,
 
 static int copy_to_user_tmpl(struct xfrm_policy *xp, struct sk_buff *skb)
 {
-	struct xfrm_user_tmpl vec[XFRM_MAX_DEPTH];
+	struct xfrm_user_tmpl vec[XFRM_MAX_DEPTH] = {};
 	int i;
 
 	if (xp->xfrm_nr == 0)
@@ -2021,7 +2021,6 @@  static int copy_to_user_tmpl(struct xfrm_policy *xp, struct sk_buff *skb)
 		struct xfrm_user_tmpl *up = &vec[i];
 		struct xfrm_tmpl *kp = &xp->xfrm_vec[i];
 
-		memset(up, 0, sizeof(*up));
 		memcpy(&up->id, &kp->id, sizeof(up->id));
 		up->family = kp->encap_family;
 		memcpy(&up->saddr, &kp->saddr, sizeof(up->saddr));