mbox series

[0/2] configs/hardening: Some fixes for UBSAN

Message ID 20240411-fix-ubsan-in-hardening-config-v1-0-e0177c80ffaa@kernel.org (mailing list archive)
Headers show
Series configs/hardening: Some fixes for UBSAN | expand

Message

Nathan Chancellor April 11, 2024, 6:11 p.m. UTC
Hi all,

This series was spurred by a couple of recent UBSAN reports in our
continuous integration that appear to be related to
CONFIG_UBSAN_SIGNED_WRAP (which gets enabled with hardening.config due
to 'default UBSAN'), as they only appear with clang-19 and newer:

  https://github.com/ClangBuiltLinux/continuous-integration2/actions/runs/8646488985/job/23709324479#step:6:500
  https://github.com/ClangBuiltLinux/continuous-integration2/actions/runs/8646488985/job/23709330815#step:6:651

I'll include the information that I have gathered so far on these
specific instances below but I think that it is debatable whether
CONFIG_UBSAN_SIGNED_WRAP should be enabled by hardening.config at this
point in time, as it does not seem "production ready" to me, given that
there has not been many resources towards getting the majority of
instances cleaned up yet from what I can tell. This is particularly
problematic since hardening.config enables CONFIG_UBSAN_TRAP, so all
instances of this problem will break the kernel at runtime, which does
not seem great to me, hence patch 2. Patch 1 seems rather
uncontroversial to me :)

As for the actual crash itself, which seems like it should still be
addressed, I landed on commit 1211f3b21c2a ("workqueue: Preserve OFFQ
bits in cancel[_sync] paths") in -next for both crashes. Not immediately
obvious to me what it is complaining about though.

  [    0.000000] Linux version 6.9.0-rc1-00001-g1211f3b21c2a (nathan@dev-arch.thelio-3990X) (ClangBuiltLinux clang version 19.0.0git (https://github.com/llvm/llvm-project be10070f91b86a6f126d2451852242bfcb2cd366), ClangBuiltLinux LLD 19.0.0) #1 SMP PREEMPT Thu Apr 11 11:02:26 MST 2024
  ...
  [    0.189542] Internal error: UBSAN: unrecognized failure code: 00000000f2005515 [#1] PREEMPT SMP
  [    0.193125] Modules linked in:
  [    0.193865] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.9.0-rc1-00001-g1211f3b21c2a #1
  [    0.194185] Hardware name: linux,dummy-virt (DT)
  [    0.194464] pstate: 010000c9 (nzcv daIF -PAN -UAO -TCO +DIT -SSBS BTYPE=--)
  [    0.194778] pc : cancel_delayed_work+0x54/0x94
  [    0.195742] lr : cancel_delayed_work+0x40/0x94
  [    0.195877] sp : ffff80008000ba30
  [    0.195990] x29: ffff80008000ba40 x28: 0000000000000000 x27: 0000000000000000
  [    0.196315] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
  [    0.196528] x23: ffff9ce4d84ac000 x22: 0000000000000000 x21: fff000000294b480
  [    0.196746] x20: ffff9ce4d8c5e000 x19: ffff9ce4d8b28c30 x18: ffff80008000d058
  [    0.196955] x17: 0000000000000000 x16: 0000000000000000 x15: dead000000000100
  [    0.197173] x14: 0000000000000001 x13: 0000000000000075 x12: 00000a0000000000
  [    0.197383] x11: fff0000002b10018 x10: 0008b1020000f0ff x9 : 7058149bb97ccd00
  [    0.197619] x8 : 00000000000000e1 x7 : 3d4d455453595342 x6 : 000000004e514553
  [    0.197828] x5 : fff0000002b1026b x4 : fff000001fbdaef0 x3 : 0000000000003400
  [    0.198038] x2 : ffff80008000ba30 x1 : 0000000000000000 x0 : 0000000000000000
  [    0.198326] Call trace:
  [    0.198544]  cancel_delayed_work+0x54/0x94
  [    0.198810]  deferred_probe_extend_timeout+0x20/0x6c
  [    0.198988]  driver_register+0xa8/0x10c
  [    0.199122]  __platform_driver_register+0x28/0x38
  [    0.199258]  tegra194_cbb_init+0x24/0x34
  [    0.199393]  do_one_initcall+0xec/0x2d0
  [    0.199543]  do_initcall_level+0xa4/0xd0
  [    0.199663]  do_initcalls+0x78/0xcc
  [    0.199770]  do_basic_setup+0x24/0x34
  [    0.199880]  kernel_init_freeable+0x110/0x180
  [    0.200014]  kernel_init+0x28/0x1b8
  [    0.200123]  ret_from_fork+0x10/0x20
  [    0.200547] Code: 54ffff60 37f80080 39400268 371001c8 (d42aa2a0) 
  [    0.200996] ---[ end trace 0000000000000000 ]---

---
Nathan Chancellor (2):
      configs/hardening: Fix disabling UBSAN configurations
      configs/hardening: Disable CONFIG_UBSAN_SIGNED_WRAP

 kernel/configs/hardening.config | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)
---
base-commit: fec50db7033ea478773b159e0e2efb135270e3b7
change-id: 20240410-fix-ubsan-in-hardening-config-92f66df06c4e

Best regards,

Comments

Kees Cook April 15, 2024, 6:09 p.m. UTC | #1
On Thu, 11 Apr 2024 11:11:05 -0700, Nathan Chancellor wrote:
> This series was spurred by a couple of recent UBSAN reports in our
> continuous integration that appear to be related to
> CONFIG_UBSAN_SIGNED_WRAP (which gets enabled with hardening.config due
> to 'default UBSAN'), as they only appear with clang-19 and newer:
> 
>   https://github.com/ClangBuiltLinux/continuous-integration2/actions/runs/8646488985/job/23709324479#step:6:500
>   https://github.com/ClangBuiltLinux/continuous-integration2/actions/runs/8646488985/job/23709330815#step:6:651
> 
> [...]

Applied to for-next/hardening, thanks!

[1/2] configs/hardening: Fix disabling UBSAN configurations
      https://git.kernel.org/kees/c/e048d668f296
[2/2] configs/hardening: Disable CONFIG_UBSAN_SIGNED_WRAP
      https://git.kernel.org/kees/c/7fcb91d94e89

Take care,
Kees Cook April 15, 2024, 6:15 p.m. UTC | #2
On Thu, Apr 11, 2024 at 11:11:05AM -0700, Nathan Chancellor wrote:
>   [    0.189542] Internal error: UBSAN: unrecognized failure code: 00000000f2005515 [#1] PREEMPT SMP

Oops! Yes, I didn't update the (arm64) trap handler to notice integer
overflows. I think I need something like:

diff --git a/lib/ubsan.c b/lib/ubsan.c
index 5fc107f61934..a2fb19f75825 100644
--- a/lib/ubsan.c
+++ b/lib/ubsan.c
@@ -77,6 +77,14 @@ const char *report_ubsan_failure(struct pt_regs *regs, u32 check_type)
 		return "UBSAN: alignment assumption";
 	case ubsan_type_mismatch:
 		return "UBSAN: type mismatch";
+#endif
+#ifdef CONFIG_UBSAN_SIGNED_INTEGER_WRAP
+	case ubsan_add_overflow:
+		return "UBSAN: integer addition overflow";
+	case ubsan_sub_overflow:
+		return "UBSAN: integer subtraction overflow";
+	case ubsan_mul_overflow:
+		return "UBSAN: integer multiplication overflow";
 #endif
 	default:
 		return "UBSAN: unrecognized failure code";

>   [    0.198326] Call trace:
>   [    0.198544]  cancel_delayed_work+0x54/0x94
>   [    0.198810]  deferred_probe_extend_timeout+0x20/0x6c
>   [    0.198988]  driver_register+0xa8/0x10c
>   [    0.199122]  __platform_driver_register+0x28/0x38
>   [    0.199258]  tegra194_cbb_init+0x24/0x34

Justin, does this trace match anything you found running syzkaller
against SIO? (I assume not -- this seems to be a tegra code path...)
Nathan Chancellor April 15, 2024, 6:32 p.m. UTC | #3
On Mon, Apr 15, 2024 at 11:15:05AM -0700, Kees Cook wrote:
> On Thu, Apr 11, 2024 at 11:11:05AM -0700, Nathan Chancellor wrote:
> >   [    0.189542] Internal error: UBSAN: unrecognized failure code: 00000000f2005515 [#1] PREEMPT SMP
> 
> Oops! Yes, I didn't update the (arm64) trap handler to notice integer
> overflows. I think I need something like:
> 
> diff --git a/lib/ubsan.c b/lib/ubsan.c
> index 5fc107f61934..a2fb19f75825 100644
> --- a/lib/ubsan.c
> +++ b/lib/ubsan.c
> @@ -77,6 +77,14 @@ const char *report_ubsan_failure(struct pt_regs *regs, u32 check_type)
>  		return "UBSAN: alignment assumption";
>  	case ubsan_type_mismatch:
>  		return "UBSAN: type mismatch";
> +#endif
> +#ifdef CONFIG_UBSAN_SIGNED_INTEGER_WRAP

This should be CONFIG_UBSAN_SIGNED_WRAP

> +	case ubsan_add_overflow:
> +		return "UBSAN: integer addition overflow";
> +	case ubsan_sub_overflow:
> +		return "UBSAN: integer subtraction overflow";
> +	case ubsan_mul_overflow:
> +		return "UBSAN: integer multiplication overflow";
>  #endif
>  	default:
>  		return "UBSAN: unrecognized failure code";

but with that fixed, it becomes

  [    0.208004] Internal error: UBSAN: integer subtraction overflow: 00000000f2005515 [#1] PREEMPT SMP

so LGTM.

> >   [    0.198326] Call trace:
> >   [    0.198544]  cancel_delayed_work+0x54/0x94
> >   [    0.198810]  deferred_probe_extend_timeout+0x20/0x6c
> >   [    0.198988]  driver_register+0xa8/0x10c
> >   [    0.199122]  __platform_driver_register+0x28/0x38
> >   [    0.199258]  tegra194_cbb_init+0x24/0x34
> 
> Justin, does this trace match anything you found running syzkaller
> against SIO? (I assume not -- this seems to be a tegra code path...)

FWIW, it is also visible with x86_64 in some other driver, so it is
likely not a driver specific issue (at least not to one driver).

[    0.219560] Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[    0.220441] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.9.0-rc4-next-20240415-dirty #1
[    0.220441] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[    0.220441] RIP: 0010:cancel_delayed_work+0x23/0x40
[    0.220441] Code: 90 90 90 90 90 90 90 f3 0f 1e fa 53 50 48 89 fb 48 c7 04 24 00 00 00 00 48 89 e2 be 01 00 00 00 e8 02 eb ff ff f6 03 04 75 05 <67> 0f b9 40 15 90 0f 0b 90 67 0f b9 40 15 66 66 66 66 66 66 2e 0f
[    0.220441] RSP: 0018:ffffa51600017b30 EFLAGS: 00010046
[    0.220441] RAX: 0000000000000000 RBX: ffffffff997db830 RCX: 6da351d27a629a00
[    0.220441] RDX: ffffa51600017b01 RSI: 0000000000000000 RDI: ffffffff997db850
[    0.220441] RBP: ffffa51600017ea8 R08: 0000000000000001 R09: 000000000000000a
[    0.220441] R10: 0000410110000001 R11: 0000000000000000 R12: 0000000000000000
[    0.220441] R13: 0000000000000000 R14: ffffffff99f2e474 R15: 0000000000000000
[    0.220441] FS:  0000000000000000(0000) GS:ffff8c485d200000(0000) knlGS:0000000000000000
[    0.220441] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.220441] CR2: ffff8c4854401000 CR3: 000000001382a000 CR4: 0000000000350ef0
[    0.220441] Call Trace:
[    0.220441]  <TASK>
[    0.220441]  ? __die_body+0x65/0xb0
[    0.220441]  ? die+0xa5/0xd0
[    0.220441]  ? do_trap+0xa1/0x170
[    0.220441]  ? cancel_delayed_work+0x23/0x40
[    0.220441]  ? cancel_delayed_work+0x23/0x40
[    0.220441]  ? handle_invalid_op+0x65/0x80
[    0.220441]  ? cancel_delayed_work+0x23/0x40
[    0.220441]  ? exc_invalid_op+0x39/0x50
[    0.220441]  ? asm_exc_invalid_op+0x1a/0x20
[    0.220441]  ? cancel_delayed_work+0x23/0x40
[    0.220441]  deferred_probe_extend_timeout+0x10/0x50
[    0.220441]  driver_register+0xa9/0x110
[    0.220441]  i2c_register_driver+0x40/0xa0
[    0.220441]  i2c_init+0x8b/0xe0
[    0.220441]  ? __pfx_i2c_init+0x10/0x10
[    0.220441]  do_one_initcall+0x110/0x350
[    0.220441]  ? kernfs_xattr_get+0x34/0x70
[    0.220441]  ? srso_return_thunk+0x5/0x5f
[    0.220441]  ? __alloc_pages_noprof+0x186/0x320
[    0.220441]  ? srso_return_thunk+0x5/0x5f
[    0.220441]  ? __get_random_u32_below+0xe/0x60
[    0.220441]  ? srso_return_thunk+0x5/0x5f
[    0.220441]  ? new_slab+0x1d6/0x710
[    0.220441]  ? new_slab+0x1d6/0x710
[    0.220441]  ? srso_return_thunk+0x5/0x5f
[    0.220441]  ? number+0x166/0x530
[    0.220441]  ? srso_return_thunk+0x5/0x5f
[    0.220441]  ? ida_alloc_range+0x459/0x4a0
[    0.220441]  ? srso_return_thunk+0x5/0x5f
[    0.220441]  ? parameq+0x12/0x90
[    0.220441]  ? srso_return_thunk+0x5/0x5f
[    0.220441]  ? srso_return_thunk+0x5/0x5f
[    0.220441]  ? parse_args+0x137/0x340
[    0.220441]  do_initcall_level+0x85/0x100
[    0.220441]  do_initcalls+0x54/0x90
[    0.220441]  kernel_init_freeable+0xf2/0x160
[    0.220441]  ? __pfx_kernel_init+0x10/0x10
[    0.220441]  kernel_init+0x15/0x1b0
[    0.220441]  ret_from_fork+0x35/0x40
[    0.220441]  ? __pfx_kernel_init+0x10/0x10
[    0.220441]  ret_from_fork_asm+0x1a/0x30
[    0.220441]  </TASK>
[    0.220441] Modules linked in:
[    0.220441] ---[ end trace 0000000000000000 ]---
[    0.220441] RIP: 0010:cancel_delayed_work+0x23/0x40
[    0.220441] Code: 90 90 90 90 90 90 90 f3 0f 1e fa 53 50 48 89 fb 48 c7 04 24 00 00 00 00 48 89 e2 be 01 00 00 00 e8 02 eb ff ff f6 03 04 75 05 <67> 0f b9 40 15 90 0f 0b 90 67 0f b9 40 15 66 66 66 66 66 66 2e 0f
[    0.220441] RSP: 0018:ffffa51600017b30 EFLAGS: 00010046
[    0.220441] RAX: 0000000000000000 RBX: ffffffff997db830 RCX: 6da351d27a629a00
[    0.220441] RDX: ffffa51600017b01 RSI: 0000000000000000 RDI: ffffffff997db850
[    0.220441] RBP: ffffa51600017ea8 R08: 0000000000000001 R09: 000000000000000a
[    0.220441] R10: 0000410110000001 R11: 0000000000000000 R12: 0000000000000000
[    0.220441] R13: 0000000000000000 R14: ffffffff99f2e474 R15: 0000000000000000
[    0.220441] FS:  0000000000000000(0000) GS:ffff8c485d200000(0000) knlGS:0000000000000000
[    0.220441] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.220441] CR2: ffff8c4854401000 CR3: 000000001382a000 CR4: 0000000000350ef0

Cheers,
Nathan
Justin Stitt April 22, 2024, 10 p.m. UTC | #4
On Mon, Apr 15, 2024 at 11:15:05AM -0700, Kees Cook wrote:
> On Thu, Apr 11, 2024 at 11:11:05AM -0700, Nathan Chancellor wrote:
> >   [    0.189542] Internal error: UBSAN: unrecognized failure code: 00000000f2005515 [#1] PREEMPT SMP
> 
> Oops! Yes, I didn't update the (arm64) trap handler to notice integer
> overflows. I think I need something like:
> 
> diff --git a/lib/ubsan.c b/lib/ubsan.c
> index 5fc107f61934..a2fb19f75825 100644
> --- a/lib/ubsan.c
> +++ b/lib/ubsan.c
> @@ -77,6 +77,14 @@ const char *report_ubsan_failure(struct pt_regs *regs, u32 check_type)
>  		return "UBSAN: alignment assumption";
>  	case ubsan_type_mismatch:
>  		return "UBSAN: type mismatch";
> +#endif
> +#ifdef CONFIG_UBSAN_SIGNED_INTEGER_WRAP
> +	case ubsan_add_overflow:
> +		return "UBSAN: integer addition overflow";
> +	case ubsan_sub_overflow:
> +		return "UBSAN: integer subtraction overflow";
> +	case ubsan_mul_overflow:
> +		return "UBSAN: integer multiplication overflow";
>  #endif
>  	default:
>  		return "UBSAN: unrecognized failure code";
> 
> >   [    0.198326] Call trace:
> >   [    0.198544]  cancel_delayed_work+0x54/0x94
> >   [    0.198810]  deferred_probe_extend_timeout+0x20/0x6c
> >   [    0.198988]  driver_register+0xa8/0x10c
> >   [    0.199122]  __platform_driver_register+0x28/0x38
> >   [    0.199258]  tegra194_cbb_init+0x24/0x34
> 
> Justin, does this trace match anything you found running syzkaller
> against SIO? (I assume not -- this seems to be a tegra code path...)

Nope, here's a full list of the SIO (just signed-IO, not unsigned-IO)
crashes I encountered with about 10 days of syzkaller

title|frequency*|date|repro
UBSAN: signed-integer-overflow in __do_adjtimex	100	2024/03/13 08:54	has C repro
UBSAN: signed-integer-overflow in __gup_longterm_locked	1	2024/03/13 00:48	
UBSAN: signed-integer-overflow in accumulate_nsecs_to_secs	7	2024/03/11 23:35	has C repro
UBSAN: signed-integer-overflow in ata1	3	2024/03/11 12:45	
UBSAN: signed-integer-overflow in blkpg_do_ioctl	100	2024/03/13 07:53	has C repro
UBSAN: signed-integer-overflow in cdrom_ioctl	100	2024/03/13 08:31	has C repro
UBSAN: signed-integer-overflow in corrupted	10	2024/03/12 08:03	
UBSAN: signed-integer-overflow in dcache_dir_lseek	10	2024/03/13 07:55	has C repro
UBSAN: signed-integer-overflow in do_io_getevents	38	2024/03/13 07:59	has C repro
UBSAN: signed-integer-overflow in done	4	2024/03/05 22:31	
UBSAN: signed-integer-overflow in generic_file_llseek_size	100	2024/03/13 09:04	has C repro
UBSAN: signed-integer-overflow in hugetlbfs_fallocate	1	2024/03/01 14:29	has C repro
UBSAN: signed-integer-overflow in init_file	100	2024/03/13 07:47	has C repro
UBSAN: signed-integer-overflow in ioctl_preallocate	95	2024/03/13 01:33	has C repro
UBSAN: signed-integer-overflow in scrollfront	31	2024/03/13 06:16	has C repro
UBSAN: signed-integer-overflow in seq_lseek	100	2024/03/13 08:29	has C repro
UBSAN: signed-integer-overflow in sr_select_speed	100	2024/03/13 08:26	has C repro
UBSAN: signed-integer-overflow in sync_file_range	100	2024/03/13 08:09	has C repro
UBSAN: signed-integer-overflow in timekeeping_inject_offset	100	2024/03/13 07:57	has C repro
UBSAN: signed-integer-overflow in udpv6_sendmsg	25	2024/03/13 07:12	has C repro
UBSAN: signed-integer-overflow in vfs_copy_file_range	100	2024/03/13 08:51	has C repro
UBSAN: signed-integer-overflow in vfs_fallocate	100	2024/03/13 08:24	has C repro


*duplicate crashes past 100 are not reported or attempted to be
reproduced.

I don't believe any of these match the trace Nathan reported.

> 
> -- 
> Kees Cook