[REGRESSION,BISECTED] erroneous buffer overflow detected in bch2_xattr_validate

Message ID ZvV6X5FPBBW7CO1f@archlinux (mailing list archive)
State New

Commit Message

Jan Hendrik Farr Sept. 26, 2024, 3:14 p.m. UTC
Hi Kent,

I found a strange regression in the patch set for 6.12.

First bad commit is: 86e92eeeb23741a072fe7532db663250ff2e726a
bcachefs: Annotate struct bch_xattr with __counted_by()

When compiling with clang 18.1.8 (also with the latest llvm main branch) and
CONFIG_FORTIFY_SOURCE=y, my rootfs does not mount because a buffer overflow is
erroneously detected.

The __counted_by attribute is supposed to be supported starting with gcc 15.
I'm not sure whether it is implemented yet, so I haven't tested with gcc trunk.

Here's the relevant section of dmesg:

[    6.248736] bcachefs (nvme1n1p2): starting version 1.12: rebalance_work_acct_fix
[    6.248744] bcachefs (nvme1n1p2): recovering from clean shutdown, journal seq 1305969
[    6.252374] ------------[ cut here ]------------
[    6.252375] memchr: detected buffer overflow: 12 byte read of buffer size 0
[    6.252379] WARNING: CPU: 18 PID: 511 at lib/string_helpers.c:1033 __fortify_report+0x45/0x50
[    6.252383] Modules linked in: bcachefs lz4hc_compress lz4_compress hid_generic usbhid btrfs crct10dif_pclmul libcrc32c crc32_pclmul crc32c_generic polyval_clmulni crc32c_intel polyval_generic raid6_pq ghash_clmulni_intel xor sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel gf128mul nvme crypto_simd ccp xhci_pci cryptd sp5100_tco xhci_pci_renesas nvme_core nvme_auth video wmi ip6_tables ip_tables x_tables i2c_dev
[    6.252404] CPU: 18 UID: 0 PID: 511 Comm: mount Not tainted 6.11.0-10065-g6fa6588e5964 #98 d8e0beb515d91b387aa60970de7203f35ddd182c
[    6.252406] Hardware name: Micro-Star International Co., Ltd. MS-7D78/PRO B650-P WIFI (MS-7D78), BIOS 1.C0 02/06/2024
[    6.252407] RIP: 0010:__fortify_report+0x45/0x50
[    6.252409] Code: 48 8b 34 c5 30 92 21 87 40 f6 c7 01 48 c7 c0 75 1b 0a 87 48 c7 c1 e1 93 07 87 48 0f 44 c8 48 c7 c7 ef 03 10 87 e8 0b c2 9b ff <0f> 0b e9 cf 5d 9e 00 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90
[    6.252410] RSP: 0018:ffffbb3d03aff350 EFLAGS: 00010246
[    6.252412] RAX: 4ce590fb7c372800 RBX: ffff98d559a400e8 RCX: 0000000000000027
[    6.252413] RDX: 0000000000000002 RSI: 00000000ffffdfff RDI: ffff98e43db21a08
[    6.252414] RBP: ffff98d559a400d0 R08: 0000000000001fff R09: ffff98e47ddcd000
[    6.252415] R10: 0000000000005ffd R11: 0000000000000004 R12: ffff98d559a40000
[    6.252416] R13: ffff98d54abf1320 R14: ffffbb3d03aff430 R15: 0000000000000000
[    6.252417] FS:  00007efc82117800(0000) GS:ffff98e43db00000(0000) knlGS:0000000000000000
[    6.252418] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    6.252419] CR2: 000055d96658ea80 CR3: 000000010a12c000 CR4: 0000000000f50ef0
[    6.252420] PKRU: 55555554
[    6.252421] Call Trace:
[    6.252423]  <TASK>
[    6.252425]  ? __warn+0xd5/0x1d0
[    6.252427]  ? __fortify_report+0x45/0x50
[    6.252429]  ? report_bug+0x144/0x1f0
[    6.252431]  ? __fortify_report+0x45/0x50
[    6.252433]  ? handle_bug+0x6a/0x90
[    6.252435]  ? exc_invalid_op+0x1a/0x50
[    6.252436]  ? asm_exc_invalid_op+0x1a/0x20
[    6.252440]  ? __fortify_report+0x45/0x50
[    6.252441]  __fortify_panic+0x9/0x10
[    6.252443]  bch2_xattr_validate+0x13b/0x140 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
[    6.252463]  bch2_btree_node_read_done+0x125a/0x17a0 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
[    6.252482]  btree_node_read_work+0x202/0x4a0 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
[    6.252499]  bch2_btree_node_read+0xa8d/0xb20 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
[    6.252514]  ? srso_alias_return_thunk+0x5/0xfbef5
[    6.252515]  ? pcpu_alloc_noprof+0x741/0xb50
[    6.252517]  ? srso_alias_return_thunk+0x5/0xfbef5
[    6.252519]  ? time_stats_update_one+0x75/0x1f0 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]

...


The memchr in question is at:
https://github.com/torvalds/linux/blob/11a299a7933e03c83818b431e6a1c53ad387423d/fs/bcachefs/xattr.c#L99

There is not actually a buffer overflow here. I checked with gdb that
xattr.v->x_name does contain a string of the correct length and that
xattr.v->x_name_len contains the correct length. Because of the __counted_by
annotation, x_name_len is what should determine the buffer size when memchr
uses __struct_size for bounds-checking.
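
The check that fires can be pictured with a minimal userspace sketch. It is
only a model: checked_memchr() and its bufsize and overflow parameters are
hypothetical names, with bufsize standing in for whatever __struct_size()
reports for the buffer. It is not the kernel's fortify implementation.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical userspace model of a fortified memchr(): `bufsize` plays the
 * role of the value __struct_size() yields for the buffer in the kernel.
 * Sets *overflow and returns NULL when the requested read exceeds it.
 */
const void *checked_memchr(const void *p, int c, size_t n,
			   size_t bufsize, int *overflow)
{
	*overflow = n > bufsize;
	if (*overflow) {
		fprintf(stderr,
			"memchr: detected buffer overflow: %zu byte read of buffer size %zu\n",
			n, bufsize);
		return NULL;
	}
	return memchr(p, c, n);
}
```

With a 12-byte name, passing bufsize = 12 lets the scan run normally, while
bufsize = 0 trips the check even though the data itself is fine. That is the
situation in the dmesg output: the buffer is valid, but the size handed to the
check is wrong.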

I'm at the point where I think this is probably a bug in clang. I have a patch
that works around the problem (more of a band-aid than a fix) and adds some
print statements:

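--
diff --git a/fs/bcachefs/xattr.c b/fs/bcachefs/xattr.c
index 56c8d3fe55a4..8d7e749b7dda 100644
--- a/fs/bcachefs/xattr.c
+++ b/fs/bcachefs/xattr.c
@@ -74,6 +74,7 @@ int bch2_xattr_validate(struct bch_fs *c, struct bkey_s_c k,
 		       enum bch_validate_flags flags)
 {
 	struct bkey_s_c_xattr xattr = bkey_s_c_to_xattr(k);
+	const struct bch_xattr *v = (void *)k.v;
 	unsigned val_u64s = xattr_val_u64s(xattr.v->x_name_len,
 					   le16_to_cpu(xattr.v->x_val_len));
 	int ret = 0;
@@ -94,9 +95,12 @@ int bch2_xattr_validate(struct bch_fs *c, struct bkey_s_c k,

 	bkey_fsck_err_on(!bch2_xattr_type_to_handler(xattr.v->x_type),
 			 c, xattr_invalid_type,
-			 "invalid type (%u)", xattr.v->x_type);
+			 "invalid type (%u)", v->x_type);

-	bkey_fsck_err_on(memchr(xattr.v->x_name, '\0', xattr.v->x_name_len),
+	pr_info("x_name_len: %d", v->x_name_len);
+	pr_info("__struct_size(x_name): %ld", __struct_size(v->x_name));
+	pr_info("__struct_size(x_name): %ld", __struct_size(xattr.v->x_name));
+	bkey_fsck_err_on(memchr(v->x_name, '\0', v->x_name_len),
 			 c, xattr_name_invalid_chars,
 			 "xattr name has invalid characters");
 fsck_err:
--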


Having memchr access the name through a pointer created with
const struct bch_xattr *v = (void *)k.v fixes it. From the print statements I
can see that __struct_size(xattr.v->x_name) incorrectly returns 0, while
__struct_size(v->x_name) correctly returns 10 in this case (the value of
x_name_len).

The generated assembly illustrates what is going wrong. Below is an excerpt
of the assembly clang generated for the bch2_xattr_validate function:

	mov	r13d, ecx
	mov	r15, rdi
	mov	r14, rsi
	mov	rdi, offset .L.str.3
	mov	rsi, offset .L__func__.bch2_xattr_validate
	mov	rbx, rdx
	mov	edx, eax
	call	_printk
	movzx	edx, byte ptr [rbx + 1]
	mov	rdi, offset .L.str.4
	mov	rsi, offset .L__func__.bch2_xattr_validate
	call	_printk
	movzx	edx, bh
	mov	rdi, offset .L.str.4
	mov	rsi, offset .L__func__.bch2_xattr_validate
	call	_printk
	lea	rdi, [rbx + 4]
	mov	r12, rbx
	movzx	edx, byte ptr [rbx + 1]
	xor	ebx, ebx
	xor	esi, esi
	call	memchr

At the start of this excerpt, rdx contains k.v (which is then moved into rbx).
The three calls to printk are the ones in my patch. For the print that uses
__struct_size(v->x_name), the compiler correctly uses
	movzx	edx, byte ptr [rbx + 1]
to load x_name_len into edx.

For the printk call that uses __struct_size(xattr.v->x_name), however, the
compiler uses
	movzx	edx, bh
bh is bits 8-15 of rbx, so this prints the second least significant byte of
the memory address of xattr.v->x_type instead of anything loaded from memory.
This is obviously completely wrong.

The memchr call that follows is correct because this build includes my patch.
Without the patch, the memchr call would have the same problem: the second
least significant byte of the memory address of x_type would be used as the
length for the bounds-check.
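
The difference between the two instructions can be sketched in plain C. The
helper names below are hypothetical. As sample input the usage note takes the
RBX value from the register dump above; whether rbx still held k.v at the
moment of the warning in the unpatched build is an assumption, not something
the trace proves.

```c
#include <stdint.h>

/* What `movzx edx, bh` computes: bits 8..15 of the value held in rbx. */
uint8_t second_lowest_byte(uint64_t reg)
{
	return (uint8_t)((reg >> 8) & 0xff);
}

/* What `movzx edx, byte ptr [rbx + 1]` computes: the byte stored at base + 1. */
uint8_t byte_at_offset_1(const uint8_t *base)
{
	return base[1];
}
```

For reg = 0xffff98d559a400e8 the first helper returns 0, which would be
consistent with the reported "buffer size 0". The second helper returns
whatever is stored in the second byte of the object, i.e. x_name_len for a
struct laid out as in the assembly above.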



The LLVM IR also shows the same problem:

define internal zeroext i1 @xattr_cmp_key(ptr nocapture readnone %0, ptr %1, ptr nocapture noundef readonly %2) #0 align 16 {
  [...]
  %51 = ptrtoint ptr %2 to i64
  %52 = lshr i64 %51, 8
  %53 = and i64 %52, 255

This is the IR for the incorrect behavior. It simply converts the pointer to an
integer, shifts it right by 8 bits, then ANDs it with 0xFF. If it did a load
(to i64) instead of ptrtoint, this would actually work, as the second least
significant byte of an i64 loaded from that memory address does contain the
value of x_name_len. It's as if clang forgot to dereference a pointer here.
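
The same shift-and-mask applied to the pointer bits versus the loaded contents
can be compared in a small sketch. struct xattr_head is a hypothetical
stand-in for the first bytes of struct bch_xattr (field offsets taken from the
assembly above), and load_le64() models a little-endian i64 load as on x86-64.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the first bytes of struct bch_xattr. */
struct xattr_head {
	uint8_t  x_type;      /* offset 0 */
	uint8_t  x_name_len;  /* offset 1 */
	uint16_t x_val_len;   /* offset 2 */
	uint8_t  x_name[4];   /* offset 4; sized so that 8 bytes are readable */
};

/* Assemble 8 bytes the way a little-endian (x86-64) i64 load would. */
uint64_t load_le64(const void *p)
{
	uint8_t b[8];
	uint64_t v = 0;

	memcpy(b, p, sizeof(b));
	for (int i = 7; i >= 0; i--)
		v = (v << 8) | b[i];
	return v;
}

/* Pattern from the incorrect IR: shift and mask the pointer value itself. */
uint64_t len_from_pointer_bits(const struct xattr_head *x)
{
	return ((uint64_t)(uintptr_t)x >> 8) & 255;
}

/* Same shift and mask, applied to the loaded contents instead. */
uint64_t len_from_loaded_word(const struct xattr_head *x)
{
	return (load_le64(x) >> 8) & 255;
}
```

len_from_loaded_word() yields x_name_len for any such object, while
len_from_pointer_bits() depends only on where the object happens to live in
memory.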

Correct IR does this (for the other printk invocation):

define internal zeroext i1 @xattr_cmp_key(ptr nocapture readnone %0, ptr %1, ptr nocapture noundef readonly %2) #0 align 16 {
  [...]
  %4 = getelementptr inbounds %struct.bch_xattr, ptr %1, i64 0, i32 1
  %5 = load i8, ptr %4, align 8
  [...]
  %48 = load i8, ptr %5, align 4
  %49 = zext i8 %48 to i64

Best Regards
Jan

Comments

Thorsten Blum Sept. 26, 2024, 3:28 p.m. UTC | #1
Hi Jan,

On 26. Sep 2024, at 17:14, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> 
> I'm at the point where I think this is probably a bug in clang. I have a patch
> that does fix (more like bandaid) the problem and adds some print statements:
> 
> --
> diff --git a/fs/bcachefs/xattr.c b/fs/bcachefs/xattr.c
> index 56c8d3fe55a4..8d7e749b7dda 100644
> --- a/fs/bcachefs/xattr.c
> +++ b/fs/bcachefs/xattr.c
> @@ -74,6 +74,7 @@ int bch2_xattr_validate(struct bch_fs *c, struct bkey_s_c k,
>       enum bch_validate_flags flags)
> {
> struct bkey_s_c_xattr xattr = bkey_s_c_to_xattr(k);
> + const struct bch_xattr *v = (void *)k.v;
> unsigned val_u64s = xattr_val_u64s(xattr.v->x_name_len,
>   le16_to_cpu(xattr.v->x_val_len));
> int ret = 0;
> @@ -94,9 +95,12 @@ int bch2_xattr_validate(struct bch_fs *c, struct bkey_s_c k,
> 
> bkey_fsck_err_on(!bch2_xattr_type_to_handler(xattr.v->x_type),
> c, xattr_invalid_type,
> - "invalid type (%u)", xattr.v->x_type);
> + "invalid type (%u)", v->x_type);
> 
> - bkey_fsck_err_on(memchr(xattr.v->x_name, '\0', xattr.v->x_name_len),
> + pr_info("x_name_len: %d", v->x_name_len);
> + pr_info("__struct_size(x_name): %ld", __struct_size(v->x_name));
> + pr_info("__struct_size(x_name): %ld", __struct_size(xattr.v->x_name));
> + bkey_fsck_err_on(memchr(v->x_name, '\0', v->x_name_len),
> c, xattr_name_invalid_chars,
> "xattr name has invalid characters");
> fsck_err:
> --
> 

I suspect it's the same Clang __bdos() "bug" as in [1] and [2].

[1] https://lore.kernel.org/linux-kernel/3D0816D1-0807-4D37-8D5F-3C55CA910FAA@linux.dev/
[2] https://lore.kernel.org/all/20240913164630.GA4091534@thelio-3990X/
Thorsten Blum Sept. 26, 2024, 4:09 p.m. UTC | #2
On 26. Sep 2024, at 17:28, Thorsten Blum <thorsten.blum@toblux.com> wrote:
> 
> I suspect it's the same Clang __bdos() "bug" as in [1] and [2].
> 
> [1] https://lore.kernel.org/linux-kernel/3D0816D1-0807-4D37-8D5F-3C55CA910FAA@linux.dev/
> [2] https://lore.kernel.org/all/20240913164630.GA4091534@thelio-3990X/

Could you try this and see if it resolves the problem?

diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 1a957ea2f4fe..b09759f31789 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -413,7 +413,7 @@ struct ftrace_likely_data {
  * When the size of an allocated object is needed, use the best available
  * mechanism to find it. (For cases where sizeof() cannot be used.)
  */
-#if __has_builtin(__builtin_dynamic_object_size)
+#if __has_builtin(__builtin_dynamic_object_size) && !defined(__clang__)
 #define __struct_size(p)	__builtin_dynamic_object_size(p, 0)
 #define __member_size(p)	__builtin_dynamic_object_size(p, 1)
 #else
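
For context on what this change would mean for clang builds: the hunk does not
show the #else branch, so the following assumes it maps __struct_size() to the
compile-time-only __builtin_object_size(). That builtin returns (size_t)-1
when the size cannot be determined at compile time, which a fortify-style
"read larger than buffer" check can never exceed. The function name below is
hypothetical.

```c
#include <stddef.h>

/*
 * Size of the object `obj` points to as far as the compiler can tell at
 * compile time; (size_t)-1 means "unknown".
 */
size_t static_size_via_opaque_pointer(void *obj)
{
	/* The volatile round trip hides the pointer's origin from the optimizer. */
	void *volatile opaque = obj;
	void *p = opaque;

	return __builtin_object_size(p, 0);
}
```

Because the pointer's origin is hidden here, the result is (size_t)-1, so a
bounds check based on it cannot produce a false positive. The trade-off is
that it also gives up the run-time checking that __counted_by is meant to
enable.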

Thanks,
Thorsten
Jan Hendrik Farr Sept. 26, 2024, 4:37 p.m. UTC | #3
On 26 18:09:57, Thorsten Blum wrote:
> On 26. Sep 2024, at 17:28, Thorsten Blum <thorsten.blum@toblux.com> wrote:
> > On 26. Sep 2024, at 17:14, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> >> 
> >> Hi Kent,
> >> 
> >> found a strange regression in the patch set for 6.12.
> >> 
> >> First bad commit is: 86e92eeeb23741a072fe7532db663250ff2e726a
> >> bcachefs: Annotate struct bch_xattr with __counted_by()
> >> 
> >> When compiling with clang 18.1.8 (also with latest llvm main branch) and
> >> CONFIG_FORTIFY_SOURCE=y my rootfs does not mount because there is an erroneous
> >> detection of a buffer overflow.
> >> 
> >> The __counted_by attribute is supposed to be supported starting with gcc 15,
> >> not sure if it is implemented yet so I haven't tested with gcc trunk yet.
> >> 
> >> Here's the relevant section of dmesg:
> >> 
> >> [    6.248736] bcachefs (nvme1n1p2): starting version 1.12: rebalance_work_acct_fix
> >> [    6.248744] bcachefs (nvme1n1p2): recovering from clean shutdown, journal seq 1305969
> >> [    6.252374] ------------[ cut here ]------------
> >> [    6.252375] memchr: detected buffer overflow: 12 byte read of buffer size 0
> >> [    6.252379] WARNING: CPU: 18 PID: 511 at lib/string_helpers.c:1033 __fortify_report+0x45/0x50
> >> [    6.252383] Modules linked in: bcachefs lz4hc_compress lz4_compress hid_generic usbhid btrfs crct10dif_pclmul libcrc32c crc32_pclmul crc32c_generic polyval_clmulni crc32c_intel polyval_generic raid6_pq ghash_clmulni_intel xor sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel gf128mul nvme crypto_simd ccp xhci_pci cryptd sp5100_tco xhci_pci_renesas nvme_core nvme_auth video wmi ip6_tables ip_tables x_tables i2c_dev
> >> [    6.252404] CPU: 18 UID: 0 PID: 511 Comm: mount Not tainted 6.11.0-10065-g6fa6588e5964 #98 d8e0beb515d91b387aa60970de7203f35ddd182c
> >> [    6.252406] Hardware name: Micro-Star International Co., Ltd. MS-7D78/PRO B650-P WIFI (MS-7D78), BIOS 1.C0 02/06/2024
> >> [    6.252407] RIP: 0010:__fortify_report+0x45/0x50
> >> [    6.252409] Code: 48 8b 34 c5 30 92 21 87 40 f6 c7 01 48 c7 c0 75 1b 0a 87 48 c7 c1 e1 93 07 87 48 0f 44 c8 48 c7 c7 ef 03 10 87 e8 0b c2 9b ff <0f> 0b e9 cf 5d 9e 00 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90
> >> [    6.252410] RSP: 0018:ffffbb3d03aff350 EFLAGS: 00010246
> >> [    6.252412] RAX: 4ce590fb7c372800 RBX: ffff98d559a400e8 RCX: 0000000000000027
> >> [    6.252413] RDX: 0000000000000002 RSI: 00000000ffffdfff RDI: ffff98e43db21a08
> >> [    6.252414] RBP: ffff98d559a400d0 R08: 0000000000001fff R09: ffff98e47ddcd000
> >> [    6.252415] R10: 0000000000005ffd R11: 0000000000000004 R12: ffff98d559a40000
> >> [    6.252416] R13: ffff98d54abf1320 R14: ffffbb3d03aff430 R15: 0000000000000000
> >> [    6.252417] FS:  00007efc82117800(0000) GS:ffff98e43db00000(0000) knlGS:0000000000000000
> >> [    6.252418] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> [    6.252419] CR2: 000055d96658ea80 CR3: 000000010a12c000 CR4: 0000000000f50ef0
> >> [    6.252420] PKRU: 55555554
> >> [    6.252421] Call Trace:
> >> [    6.252423]  <TASK>
> >> [    6.252425]  ? __warn+0xd5/0x1d0
> >> [    6.252427]  ? __fortify_report+0x45/0x50
> >> [    6.252429]  ? report_bug+0x144/0x1f0
> >> [    6.252431]  ? __fortify_report+0x45/0x50
> >> [    6.252433]  ? handle_bug+0x6a/0x90
> >> [    6.252435]  ? exc_invalid_op+0x1a/0x50
> >> [    6.252436]  ? asm_exc_invalid_op+0x1a/0x20
> >> [    6.252440]  ? __fortify_report+0x45/0x50
> >> [    6.252441]  __fortify_panic+0x9/0x10
> >> [    6.252443]  bch2_xattr_validate+0x13b/0x140 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
> >> [    6.252463]  bch2_btree_node_read_done+0x125a/0x17a0 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
> >> [    6.252482]  btree_node_read_work+0x202/0x4a0 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
> >> [    6.252499]  bch2_btree_node_read+0xa8d/0xb20 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
> >> [    6.252514]  ? srso_alias_return_thunk+0x5/0xfbef5
> >> [    6.252515]  ? pcpu_alloc_noprof+0x741/0xb50
> >> [    6.252517]  ? srso_alias_return_thunk+0x5/0xfbef5
> >> [    6.252519]  ? time_stats_update_one+0x75/0x1f0 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
> >> 
> >> ...
> >> 
> >> 
> >> The memchr in question is at:
> >> https://github.com/torvalds/linux/blob/11a299a7933e03c83818b431e6a1c53ad387423d/fs/bcachefs/xattr.c#L99
> >> 
> >> There is not actually a buffer overflow here, I checked with gdb that
> >> xattr.v->x_name does actually contain a string of the correct length and
> >> xattr.v->x_name_len contains the correct length and should be used to determine
> >> the length when memchr uses __struct_size for bounds-checking due to the
> >> __counted_by annotation.
> >> 
> >> I'm at the point where I think this is probably a bug in clang. I have a patch
> >> that does fix (more like bandaid) the problem and adds some print statements:
> >> 
> >> --
> >> diff --git a/fs/bcachefs/xattr.c b/fs/bcachefs/xattr.c
> >> index 56c8d3fe55a4..8d7e749b7dda 100644
> >> --- a/fs/bcachefs/xattr.c
> >> +++ b/fs/bcachefs/xattr.c
> >> @@ -74,6 +74,7 @@ int bch2_xattr_validate(struct bch_fs *c, struct bkey_s_c k,
> >>      enum bch_validate_flags flags)
> >> {
> >> struct bkey_s_c_xattr xattr = bkey_s_c_to_xattr(k);
> >> + const struct bch_xattr *v = (void *)k.v;
> >> unsigned val_u64s = xattr_val_u64s(xattr.v->x_name_len,
> >>  le16_to_cpu(xattr.v->x_val_len));
> >> int ret = 0;
> >> @@ -94,9 +95,12 @@ int bch2_xattr_validate(struct bch_fs *c, struct bkey_s_c k,
> >> 
> >> bkey_fsck_err_on(!bch2_xattr_type_to_handler(xattr.v->x_type),
> >> c, xattr_invalid_type,
> >> - "invalid type (%u)", xattr.v->x_type);
> >> + "invalid type (%u)", v->x_type);
> >> 
> >> - bkey_fsck_err_on(memchr(xattr.v->x_name, '\0', xattr.v->x_name_len),
> >> + pr_info("x_name_len: %d", v->x_name_len);
> >> + pr_info("__struct_size(x_name): %ld", __struct_size(v->x_name));
> >> + pr_info("__struct_size(x_name): %ld", __struct_size(xattr.v->x_name));
> >> + bkey_fsck_err_on(memchr(v->x_name, '\0', v->x_name_len),
> >> c, xattr_name_invalid_chars,
> >> "xattr name has invalid characters");
> >> fsck_err:
> >> --
> >> 
> >> 
> >> Making memchr access via a pointer created with
> >> const struct bch_xattr *v = (void *)k.v fixes it. From the print statements I
> >> can see that __struct_size(xattr.v->x_name) incorrectly returns 0, while
> >> __struct_size(v->x_name) correctly returns 10 in this case (the value of
> >> x_name_len).
> >> 
> >> The generated assembly illustrates what is going wrong. Below is an excerpt
> >> of the assembly clang generated for the bch2_xattr_validate function:
> >> 
> >> mov r13d, ecx
> >> mov r15, rdi
> >> mov r14, rsi
> >> mov rdi, offset .L.str.3
> >> mov rsi, offset .L__func__.bch2_xattr_validate
> >> mov rbx, rdx
> >> mov edx, eax
> >> call _printk
> >> movzx edx, byte ptr [rbx + 1]
> >> mov rdi, offset .L.str.4
> >> mov rsi, offset .L__func__.bch2_xattr_validate
> >> call _printk
> >> movzx edx, bh
> >> mov rdi, offset .L.str.4
> >> mov rsi, offset .L__func__.bch2_xattr_validate
> >> call _printk
> >> lea rdi, [rbx + 4]
> >> mov r12, rbx
> >> movzx edx, byte ptr [rbx + 1]
> >> xor ebx, ebx
> >> xor esi, esi
> >> call memchr
> >> 
> >> At the start of this rdx contains k.v (and is moved into rbx). The three calls
> >> to printk are the ones you can see in my patch. You can see that for the
> >> print that uses __struct_size(v->x_name) the compiler correctly uses
> >> movzx edx, byte ptr [rbx + 1]
> >> to load x_name_len into edx.
> >> 
> >> For the printk call that uses __struct_size(xattr.v->x_name) however the
> >> compiler uses
> >> movzx edx, bh
> >> So it will print the high 8 bits of the lower 16 bits (second least
> >> significant byte) of the memory address of xattr.v->x_type. This is obviously
> >> completely wrong.
> >> 
> >> It is then doing the correct call of memchr because this is using my patch.
> >> Without my patch it would be doing the same thing for the call to memchr where
> >> it uses the second least significant byte of the memory address of x_type as the
> >> length used for the bounds-check.
> >> 
> >> 
> >> 
> >> The LLVM IR also shows the same problem:
> >> 
> >> define internal zeroext i1 @xattr_cmp_key(ptr nocapture readnone %0, ptr %1, ptr nocapture noundef readonly %2) #0 align 16 {
> >> [...]
> >> %51 = ptrtoint ptr %2 to i64
> >> %52 = lshr i64 %51, 8
> >> %53 = and i64 %52, 255
> >> 
> >> This is the IR for the incorrect behavior. It simply converts the pointer to an
> >> int, shifts right by 8 bits, then ANDs with 0xFF. If it did a load (to i64)
> >> instead of ptrtoint this would actually work, as the second least significant
> >> byte of an i64 loaded from that memory address does contain the value of
> >> x_name_len. It's as if clang forgot to dereference a pointer here.
> >> 
> >> Correct IR does this (for the other printk invocation):
> >> 
> >> define internal zeroext i1 @xattr_cmp_key(ptr nocapture readnone %0, ptr %1, ptr nocapture noundef readonly %2) #0 align 16 {
> >> [...]
> >> %4 = getelementptr inbounds %struct.bch_xattr, ptr %1, i64 0, i32 1
> >> %5 = load i8, ptr %4, align 8
> >> [...]
> >> %48 = load i8, ptr %5, align 4
> >> %49 = zext i8 %48 to i64
> >> 
> >> Best Regards
> >> Jan
> > 
> > I suspect it's the same Clang __bdos() "bug" as in [1] and [2].
> > 
> > [1] https://lore.kernel.org/linux-kernel/3D0816D1-0807-4D37-8D5F-3C55CA910FAA@linux.dev/
> > [2] https://lore.kernel.org/all/20240913164630.GA4091534@thelio-3990X/
> 
> Could you try this and see if it resolves the problem?
> 
> diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> index 1a957ea2f4fe..b09759f31789 100644
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -413,7 +413,7 @@ struct ftrace_likely_data {
>   * When the size of an allocated object is needed, use the best available
>   * mechanism to find it. (For cases where sizeof() cannot be used.)
>   */
> -#if __has_builtin(__builtin_dynamic_object_size)
> +#if __has_builtin(__builtin_dynamic_object_size) && !defined(__clang__)
>  #define __struct_size(p)	__builtin_dynamic_object_size(p, 0)
>  #define __member_size(p)	__builtin_dynamic_object_size(p, 1)
>  #else
> 

Weirdly enough, it does not. If I print the result of __struct_size
before the call to memchr I get 0xFFFF... though, so it should work. But
inside memchr it still gets 0.

I'll fire up the debugger...
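
For context, the check that trips here can be sketched in userspace as
follows. This is only a minimal sketch, not the kernel's code:
checked_memchr and its explicit p_size parameter are hypothetical stand-ins
for the fortified memchr and the value it gets from __struct_size, with
SIZE_MAX meaning "size unknown".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for the fortified memchr() check.
 * p_size plays the role of __struct_size(p); SIZE_MAX means "unknown".
 * *overflow is set when the requested read exceeds the known object size.
 */
static const void *checked_memchr(const void *p, int c, size_t size,
				  size_t p_size, bool *overflow)
{
	assert(p != NULL && overflow != NULL);

	*overflow = p_size < size;
	if (*overflow)
		return NULL;	/* the kernel would report and panic here */

	return memchr(p, c, size);
}
```

With p_size == SIZE_MAX the check can never fire, which is why a -1 from
__struct_size ought to be harmless. With p_size == 0 any non-zero read is
flagged, which matches the "12 byte read of buffer size 0" report.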

> Thanks,
> Thorsten
Jan Hendrik Farr Sept. 26, 2024, 5:01 p.m. UTC | #4
On 26 18:09:57, Thorsten Blum wrote:
> Could you try this and see if it resolves the problem?
> 
> diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> index 1a957ea2f4fe..b09759f31789 100644
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -413,7 +413,7 @@ struct ftrace_likely_data {
>   * When the size of an allocated object is needed, use the best available
>   * mechanism to find it. (For cases where sizeof() cannot be used.)
>   */
> -#if __has_builtin(__builtin_dynamic_object_size)
> +#if __has_builtin(__builtin_dynamic_object_size) && !defined(__clang__)
>  #define __struct_size(p)	__builtin_dynamic_object_size(p, 0)
>  #define __member_size(p)	__builtin_dynamic_object_size(p, 1)
>  #else
> 

Alright, after looking at it in the debugger, the code it generates now is
just wild.

I added one more printk before the call to memchr like so:

diff --git a/fs/bcachefs/xattr.c b/fs/bcachefs/xattr.c
index 56c8d3fe55a4..3c7c479ea3a8 100644
--- a/fs/bcachefs/xattr.c
+++ b/fs/bcachefs/xattr.c
@@ -96,6 +96,7 @@ int bch2_xattr_validate(struct bch_fs *c, struct bkey_s_c k,
 			 c, xattr_invalid_type,
 			 "invalid type (%u)", xattr.v->x_type);
 
+	pr_info("__struct_size(x_name): %lu", __struct_size(xattr.v->x_name));
 	bkey_fsck_err_on(memchr(xattr.v->x_name, '\0', xattr.v->x_name_len),
 			 c, xattr_name_invalid_chars,
 			 "xattr name has invalid characters");
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index f14c275950b5..43ac0bca485d 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -413,7 +413,7 @@ struct ftrace_likely_data {
  * When the size of an allocated object is needed, use the best available
  * mechanism to find it. (For cases where sizeof() cannot be used.)
  */
-#if __has_builtin(__builtin_dynamic_object_size)
+#if __has_builtin(__builtin_dynamic_object_size) && !defined(__clang__)
 #define __struct_size(p)	__builtin_dynamic_object_size(p, 0)
 #define __member_size(p)	__builtin_dynamic_object_size(p, 1)
 #else


Here's the generated assembly for this:

	mov	rdi, offset .L.str.3
	mov	rsi, offset .L__func__.bch2_xattr_validate
	mov	r12, rdx
	mov	rdx, -1
	call	_printk
	mov	rax, r12
	movzx	esi, ah
	movzx	edx, byte ptr [r12 + 1]
	cmp	rsi, rdx
	jb	.LBB4_15
# %bb.11:
	lea	rdi, [rax + 4]
	xor	ebx, ebx
	xor	esi, esi
	call	memchr

So for the printk it hardcoded -1 (i.e. 0xFFFF..., the maximum unsigned 64-bit
value) as the result of __struct_size. But then, before the call to memchr, it
does the same thing as before and puts the second least significant byte of the
memory address of x_type in esi, only to then load the correct value of
x_name_len into edx and compare the two for the bounds check.

Best Regards
Jan
Jan Hendrik Farr Sept. 26, 2024, 5:45 p.m. UTC | #5
On 26 19:01:20, Jan Hendrik Farr wrote:
> Alright, after looking at it in the debugger, the code it generates now is
> just wild.
> 
> I added one more printk before the call to memchr like so:
> 
> diff --git a/fs/bcachefs/xattr.c b/fs/bcachefs/xattr.c
> index 56c8d3fe55a4..3c7c479ea3a8 100644
> --- a/fs/bcachefs/xattr.c
> +++ b/fs/bcachefs/xattr.c
> @@ -96,6 +96,7 @@ int bch2_xattr_validate(struct bch_fs *c, struct bkey_s_c k,
>  			 c, xattr_invalid_type,
>  			 "invalid type (%u)", xattr.v->x_type);
>  
> +	pr_info("__struct_size(x_name): %lu", __struct_size(xattr.v->x_name));
>  	bkey_fsck_err_on(memchr(xattr.v->x_name, '\0', xattr.v->x_name_len),
>  			 c, xattr_name_invalid_chars,
>  			 "xattr name has invalid characters");
> diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> index f14c275950b5..43ac0bca485d 100644
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -413,7 +413,7 @@ struct ftrace_likely_data {
>   * When the size of an allocated object is needed, use the best available
>   * mechanism to find it. (For cases where sizeof() cannot be used.)
>   */
> -#if __has_builtin(__builtin_dynamic_object_size)
> +#if __has_builtin(__builtin_dynamic_object_size) && !defined(__clang__)
>  #define __struct_size(p)	__builtin_dynamic_object_size(p, 0)
>  #define __member_size(p)	__builtin_dynamic_object_size(p, 1)
>  #else
> 
> 
> Here's the generated assembly for this:
> 
> 	mov	rdi, offset .L.str.3
> 	mov	rsi, offset .L__func__.bch2_xattr_validate
> 	mov	r12, rdx
> 	mov	rdx, -1
> 	call	_printk
> 	mov	rax, r12
> 	movzx	esi, ah
> 	movzx	edx, byte ptr [r12 + 1]
> 	cmp	rsi, rdx
> 	jb	.LBB4_15
> # %bb.11:
> 	lea	rdi, [rax + 4]
> 	xor	ebx, ebx
> 	xor	esi, esi
> 	call	memchr
> 
> So for the printk it hardcoded -1 (i.e. 0xFFFF..., the maximum unsigned 64-bit
> value) as the result of __struct_size. But then, before the call to memchr, it
> does the same thing as before and puts the second least significant byte of the
> memory address of x_type in esi, only to then load the correct value of
> x_name_len into edx and compare the two for the bounds check.
> 


__builtin_object_size should only ever evaluate to a compile-time constant,
right? So it looks like this is pretty broken at the moment.

I think until this stuff is fixed in clang the only real option is:

--
diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h
index 32284cd26d52..bc5ee8ab4d21 100644
--- a/include/linux/compiler_attributes.h
+++ b/include/linux/compiler_attributes.h
@@ -101,7 +101,7 @@
  *   gcc: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108896
  * clang: https://github.com/llvm/llvm-project/pull/76348
  */
-#if __has_attribute(__counted_by__)
+#if __has_attribute(__counted_by__) && !defined(__clang__)
 # define __counted_by(member)		__attribute__((__counted_by__(member)))
 #else
 # define __counted_by(member)
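
To illustrate what this workaround does and does not change, here is a
minimal userspace sketch. The names COUNTED_BY and struct xattr_like are
hypothetical; the field offsets are only inferred from the assembly above
(x_name_len at offset 1, x_name at offset 4) and are an assumption, not the
real struct bch_xattr definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Older compilers may lack __has_attribute; treat that as "not supported". */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Same shape as the workaround: use the attribute everywhere except clang. */
#if __has_attribute(__counted_by__) && !defined(__clang__)
# define COUNTED_BY(member)	__attribute__((__counted_by__(member)))
#else
# define COUNTED_BY(member)
#endif

/* Hypothetical stand-in with the layout suggested by the assembly. */
struct xattr_like {
	uint8_t		x_type;
	uint8_t		x_name_len;
	uint16_t	x_val_len;
	uint8_t		x_name[] COUNTED_BY(x_name_len);
};

/* The annotation only carries bounds information; layout is unaffected. */
static_assert(offsetof(struct xattr_like, x_name_len) == 1,
	      "x_name_len is expected at offset 1");
static_assert(offsetof(struct xattr_like, x_name) == 4,
	      "x_name is expected at offset 4");
```

When the macro expands to nothing the struct layout stays the same; the
compiler simply no longer has a count to derive a dynamic object size from.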
Ard Biesheuvel Sept. 26, 2024, 7:58 p.m. UTC | #6
(cc Kees and Bill)

On Thu, 26 Sept 2024 at 19:46, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> [...]
Bill Wendling Sept. 26, 2024, 10:18 p.m. UTC | #7
On Thu, Sep 26, 2024 at 12:58 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> (cc Kees and Bill)
> [...]
There seems to be an issue with how the offset to the flexible array
member is calculated internally. I'm looking into it now.

-bw
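
For readers following along, the two code paths discussed in this thread can be contrasted in plain C. This is only a sketch under stated assumptions: struct demo_xattr and all helper names are hypothetical stand-ins, the field offsets are chosen to match the quoted assembly (x_name_len at byte 1, x_name at byte 4), and the code does not call __builtin_dynamic_object_size at all. It shows the count a bounds check is expected to use (a byte loaded from the object) next to the value the quoted IR computes (bits 8..15 of the pointer itself).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct bch_xattr, same offsets as in the asm. */
struct demo_xattr {
	uint8_t  x_type;
	uint8_t  x_name_len;
	uint16_t x_val_len;
	uint8_t  x_name[];	/* counted by x_name_len in the real struct */
};

_Static_assert(offsetof(struct demo_xattr, x_name_len) == 1, "count at byte 1");
_Static_assert(offsetof(struct demo_xattr, x_name) == 4, "array at byte 4");

/* Expected behaviour: the count is loaded from the object. */
static size_t count_from_object(const struct demo_xattr *x)
{
	return x->x_name_len;
}

/* Behaviour seen in the quoted IR: ptrtoint, lshr 8, and 255 on the address. */
static size_t count_from_address(const struct demo_xattr *x)
{
	return ((uintptr_t)x >> 8) & 0xff;
}

/* Allocate a demo object holding name without a trailing NUL. */
static struct demo_xattr *demo_xattr_new(const char *name)
{
	size_t len = strlen(name);
	struct demo_xattr *x;

	if (len > UINT8_MAX)
		return NULL;
	x = malloc(sizeof(*x) + len);
	if (!x)
		return NULL;
	x->x_type = 0;
	x->x_name_len = (uint8_t)len;
	x->x_val_len = 0;
	memcpy(x->x_name, name, len);
	return x;
}
```

For an object created with demo_xattr_new("user.demo1"), count_from_object() returns 10, while count_from_address() returns whatever byte happens to sit in bits 8..15 of the allocation address, which has nothing to do with the name length.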
Bill Wendling Sept. 27, 2024, 1:30 a.m. UTC | #8
On Thu, Sep 26, 2024 at 3:18 PM Bill Wendling <morbo@google.com> wrote:
>
> On Thu, Sep 26, 2024 at 12:58 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > (cc Kees and Bill)
> >
> > On Thu, 26 Sept 2024 at 19:46, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> > >
> > > On 26 19:01:20, Jan Hendrik Farr wrote:
> > > > On 26 18:09:57, Thorsten Blum wrote:
> > > > > On 26. Sep 2024, at 17:28, Thorsten Blum <thorsten.blum@toblux.com> wrote:
> > > > > > On 26. Sep 2024, at 17:14, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> > > > > >>
> > > > > >> Hi Kent,
> > > > > >>
> > > > > >> found a strange regression in the patch set for 6.12.
> > > > > >>
> > > > > >> First bad commit is: 86e92eeeb23741a072fe7532db663250ff2e726a
> > > > > >> bcachefs: Annotate struct bch_xattr with __counted_by()
> > > > > >>
> > > > > >> When compiling with clang 18.1.8 (also with latest llvm main branch) and
> > > > > >> CONFIG_FORTIFY_SOURCE=y my rootfs does not mount because there is an erroneous
> > > > > >> detection of a buffer overflow.
> > > > > >>
> > > > > >> The __counted_by attribute is supposed to be supported starting with gcc 15,
> > > > > >> not sure if it is implemented yet so I haven't tested with gcc trunk yet.
> > > > > >>
> > > > > >> Here's the relevant section of dmesg:
> > > > > >>
> > > > > >> [    6.248736] bcachefs (nvme1n1p2): starting version 1.12: rebalance_work_acct_fix
> > > > > >> [    6.248744] bcachefs (nvme1n1p2): recovering from clean shutdown, journal seq 1305969
> > > > > >> [    6.252374] ------------[ cut here ]------------
> > > > > >> [    6.252375] memchr: detected buffer overflow: 12 byte read of buffer size 0
> > > > > >> [    6.252379] WARNING: CPU: 18 PID: 511 at lib/string_helpers.c:1033 __fortify_report+0x45/0x50
> > > > > >> [    6.252383] Modules linked in: bcachefs lz4hc_compress lz4_compress hid_generic usbhid btrfs crct10dif_pclmul libcrc32c crc32_pclmul crc32c_generic polyval_clmulni crc32c_intel polyval_generic raid6_pq ghash_clmulni_intel xor sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel gf128mul nvme crypto_simd ccp xhci_pci cryptd sp5100_tco xhci_pci_renesas nvme_core nvme_auth video wmi ip6_tables ip_tables x_tables i2c_dev
> > > > > >> [    6.252404] CPU: 18 UID: 0 PID: 511 Comm: mount Not tainted 6.11.0-10065-g6fa6588e5964 #98 d8e0beb515d91b387aa60970de7203f35ddd182c
> > > > > >> [    6.252406] Hardware name: Micro-Star International Co., Ltd. MS-7D78/PRO B650-P WIFI (MS-7D78), BIOS 1.C0 02/06/2024
> > > > > >> [    6.252407] RIP: 0010:__fortify_report+0x45/0x50
> > > > > >> [    6.252409] Code: 48 8b 34 c5 30 92 21 87 40 f6 c7 01 48 c7 c0 75 1b 0a 87 48 c7 c1 e1 93 07 87 48 0f 44 c8 48 c7 c7 ef 03 10 87 e8 0b c2 9b ff <0f> 0b e9 cf 5d 9e 00 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90
> > > > > >> [    6.252410] RSP: 0018:ffffbb3d03aff350 EFLAGS: 00010246
> > > > > >> [    6.252412] RAX: 4ce590fb7c372800 RBX: ffff98d559a400e8 RCX: 0000000000000027
> > > > > >> [    6.252413] RDX: 0000000000000002 RSI: 00000000ffffdfff RDI: ffff98e43db21a08
> > > > > >> [    6.252414] RBP: ffff98d559a400d0 R08: 0000000000001fff R09: ffff98e47ddcd000
> > > > > >> [    6.252415] R10: 0000000000005ffd R11: 0000000000000004 R12: ffff98d559a40000
> > > > > >> [    6.252416] R13: ffff98d54abf1320 R14: ffffbb3d03aff430 R15: 0000000000000000
> > > > > >> [    6.252417] FS:  00007efc82117800(0000) GS:ffff98e43db00000(0000) knlGS:0000000000000000
> > > > > >> [    6.252418] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > >> [    6.252419] CR2: 000055d96658ea80 CR3: 000000010a12c000 CR4: 0000000000f50ef0
> > > > > >> [    6.252420] PKRU: 55555554
> > > > > >> [    6.252421] Call Trace:
> > > > > >> [    6.252423]  <TASK>
> > > > > >> [    6.252425]  ? __warn+0xd5/0x1d0
> > > > > >> [    6.252427]  ? __fortify_report+0x45/0x50
> > > > > >> [    6.252429]  ? report_bug+0x144/0x1f0
> > > > > >> [    6.252431]  ? __fortify_report+0x45/0x50
> > > > > >> [    6.252433]  ? handle_bug+0x6a/0x90
> > > > > >> [    6.252435]  ? exc_invalid_op+0x1a/0x50
> > > > > >> [    6.252436]  ? asm_exc_invalid_op+0x1a/0x20
> > > > > >> [    6.252440]  ? __fortify_report+0x45/0x50
> > > > > >> [    6.252441]  __fortify_panic+0x9/0x10
> > > > > >> [    6.252443]  bch2_xattr_validate+0x13b/0x140 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
> > > > > >> [    6.252463]  bch2_btree_node_read_done+0x125a/0x17a0 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
> > > > > >> [    6.252482]  btree_node_read_work+0x202/0x4a0 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
> > > > > >> [    6.252499]  bch2_btree_node_read+0xa8d/0xb20 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
> > > > > >> [    6.252514]  ? srso_alias_return_thunk+0x5/0xfbef5
> > > > > >> [    6.252515]  ? pcpu_alloc_noprof+0x741/0xb50
> > > > > >> [    6.252517]  ? srso_alias_return_thunk+0x5/0xfbef5
> > > > > >> [    6.252519]  ? time_stats_update_one+0x75/0x1f0 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
> > > > > >>
> > > > > >> ...
> > > > > >>
> > > > > >>
> > > > > >> The memchr in question is at:
> > > > > >> https://github.com/torvalds/linux/blob/11a299a7933e03c83818b431e6a1c53ad387423d/fs/bcachefs/xattr.c#L99
> > > > > >>
> > > > > >> There is not actually a buffer overflow here, I checked with gdb that
> > > > > >> xattr.v->x_name does actually contain a string of the correct length and
> > > > > >> xattr.v->x_name_len contains the correct length and should be used to determine
> > > > > >> the length when memchr uses __struct_size for bounds-checking due to the
> > > > > >> __counted_by annotation.
> > > > > >>
> > > > > >> I'm at the point where I think this is probably a bug in clang. I have a patch
> > > > > >> that does fix (more like bandaid) the problem and adds some print statements:
> > > > > >>
> > > > > >> --
> > > > > >> diff --git a/fs/bcachefs/xattr.c b/fs/bcachefs/xattr.c
> > > > > >> index 56c8d3fe55a4..8d7e749b7dda 100644
> > > > > >> --- a/fs/bcachefs/xattr.c
> > > > > >> +++ b/fs/bcachefs/xattr.c
> > > > > >> @@ -74,6 +74,7 @@ int bch2_xattr_validate(struct bch_fs *c, struct bkey_s_c k,
> > > > > >>      enum bch_validate_flags flags)
> > > > > >> {
> > > > > >> struct bkey_s_c_xattr xattr = bkey_s_c_to_xattr(k);
> > > > > >> + const struct bch_xattr *v = (void *)k.v;
> > > > > >> unsigned val_u64s = xattr_val_u64s(xattr.v->x_name_len,
> > > > > >>  le16_to_cpu(xattr.v->x_val_len));
> > > > > >> int ret = 0;
> > > > > >> @@ -94,9 +95,12 @@ int bch2_xattr_validate(struct bch_fs *c, struct bkey_s_c k,
> > > > > >>
> > > > > >> bkey_fsck_err_on(!bch2_xattr_type_to_handler(xattr.v->x_type),
> > > > > >> c, xattr_invalid_type,
> > > > > >> - "invalid type (%u)", xattr.v->x_type);
> > > > > >> + "invalid type (%u)", v->x_type);
> > > > > >>
> > > > > >> - bkey_fsck_err_on(memchr(xattr.v->x_name, '\0', xattr.v->x_name_len),
> > > > > >> + pr_info("x_name_len: %d", v->x_name_len);
> > > > > >> + pr_info("__struct_size(x_name): %ld", __struct_size(v->x_name));
> > > > > >> + pr_info("__struct_size(x_name): %ld", __struct_size(xattr.v->x_name));
> > > > > >> + bkey_fsck_err_on(memchr(v->x_name, '\0', v->x_name_len),
> > > > > >> c, xattr_name_invalid_chars,
> > > > > >> "xattr name has invalid characters");
> > > > > >> fsck_err:
> > > > > >> --
> > > > > >>
> > > > > >>
> > > > > >> Making memchr access via a pointer created with
> > > > > >> const struct bch_xattr *v = (void *)k.v fixes it. From the print statements I
> > > > > >> can see that __struct_size(xattr.v->x_name) incorrectly returns 0, while
> > > > > >> __struct_size(v->x_name) correctly returns 10 in this case (the value of
> > > > > >> x_name_len).
> > > > > >>
> > > > > >> The generated assembly illustrates what is going wrong. Below is an excerpt
> > > > > >> of the assembly clang generated for the bch2_xattr_validate function:
> > > > > >>
> > > > > >> mov r13d, ecx
> > > > > >> mov r15, rdi
> > > > > >> mov r14, rsi
> > > > > >> mov rdi, offset .L.str.3
> > > > > >> mov rsi, offset .L__func__.bch2_xattr_validate
> > > > > >> mov rbx, rdx
> > > > > >> mov edx, eax
> > > > > >> call _printk
> > > > > >> movzx edx, byte ptr [rbx + 1]
> > > > > >> mov rdi, offset .L.str.4
> > > > > >> mov rsi, offset .L__func__.bch2_xattr_validate
> > > > > >> call _printk
> > > > > >> movzx edx, bh
> > > > > >> mov rdi, offset .L.str.4
> > > > > >> mov rsi, offset .L__func__.bch2_xattr_validate
> > > > > >> call _printk
> > > > > >> lea rdi, [rbx + 4]
> > > > > >> mov r12, rbx
> > > > > >> movzx edx, byte ptr [rbx + 1]
> > > > > >> xor ebx, ebx
> > > > > >> xor esi, esi
> > > > > >> call memchr
> > > > > >>
> > > > > >> At the start of this rdx contains k.v (and is moved into rbx). The three calls
> > > > > >> to printk are the ones you can see in my patch. You can see that for the
> > > > > >> print that uses __struct_size(v->x_name) the compiler correctly uses
> > > > > >> movzx edx, byte ptr [rbx + 1]
> > > > > >> to load x_name_len into edx.
> > > > > >>
> > > > > >> For the printk call that uses __struct_size(xattr.v->x_name) however the
> > > > > >> compiler uses
> > > > > >> movzx edx, bh
> > > > > >> So it will print the high 8 bits of the lower 16 bits (second least
> > > > > >> significant byte) of the memory address of xattr.v->x_type. This is obviously
> > > > > >> completely wrong.
> > > > > >>
> > > > > >> It is then doing the correct call of memchr because this is using my patch.
> > > > > >> Without my patch it would be doing the same thing for the call to memchr where
> > > > > >> it uses the second least significant byte of the memory address of x_type as the
> > > > > >> length used for the bounds-check.
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> The LLVM IR also shows the same problem:
> > > > > >>
> > > > > >> define internal zeroext i1 @xattr_cmp_key(ptr nocapture readnone %0, ptr %1, ptr nocapture noundef readonly %2) #0 align 16 {
> > > > > >> [...]
> > > > > >> %51 = ptrtoint ptr %2 to i64
> > > > > >> %52 = lshr i64 %51, 8
> > > > > >> %53 = and i64 %52, 255
> > > > > >>
> > > > > >> This is the IR for the incorrect behavior. It simply converts the pointer to an
> > > > > >> int, shifts right by 8 bits, then and with 0xFF. If it did a load (to i64)
> > > > > > >> instead of ptrtoint this would actually work, as the second least significant
> > > > > > >> byte of an i64 loaded from that memory address does contain the value of
> > > > > >> x_name_len. It's as if clang forgot to dereference a pointer here.
> > > > > >>
> > > > > >> Correct IR does this (for the other printk invocation):
> > > > > >>
> > > > > >> define internal zeroext i1 @xattr_cmp_key(ptr nocapture readnone %0, ptr %1, ptr nocapture noundef readonly %2) #0 align 16 {
> > > > > >> [...]
> > > > > >> %4 = getelementptr inbounds %struct.bch_xattr, ptr %1, i64 0, i32 1
> > > > > >> %5 = load i8, ptr %4, align 8
> > > > > >> [...]
> > > > > >> %48 = load i8, ptr %5, align 4
> > > > > >> %49 = zext i8 %48 to i64
> > > > > >>
> > > > > >> Best Regards
> > > > > >> Jan
> > > > > >
> > > > > > I suspect it's the same Clang __bdos() "bug" as in [1] and [2].
> > > > > >
> > > > > > [1] https://lore.kernel.org/linux-kernel/3D0816D1-0807-4D37-8D5F-3C55CA910FAA@linux.dev/
> > > > > > [2] https://lore.kernel.org/all/20240913164630.GA4091534@thelio-3990X/
> > > > >
> > > > > Could you try this and see if it resolves the problem?
> > > > >
> > > > > diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> > > > > index 1a957ea2f4fe..b09759f31789 100644
> > > > > --- a/include/linux/compiler_types.h
> > > > > +++ b/include/linux/compiler_types.h
> > > > > @@ -413,7 +413,7 @@ struct ftrace_likely_data {
> > > > >   * When the size of an allocated object is needed, use the best available
> > > > >   * mechanism to find it. (For cases where sizeof() cannot be used.)
> > > > >   */
> > > > > -#if __has_builtin(__builtin_dynamic_object_size)
> > > > > +#if __has_builtin(__builtin_dynamic_object_size) && !defined(__clang__)
> > > > >  #define __struct_size(p)   __builtin_dynamic_object_size(p, 0)
> > > > >  #define __member_size(p)   __builtin_dynamic_object_size(p, 1)
> > > > >  #else
> > > > >
> > > >
> > > > Alright after looking at it in the debugger the code it generates now is
> > > > just wild.
> > > >
> > > > I added one more printk before the call to memchr like so:
> > > >
> > > > diff --git a/fs/bcachefs/xattr.c b/fs/bcachefs/xattr.c
> > > > index 56c8d3fe55a4..3c7c479ea3a8 100644
> > > > --- a/fs/bcachefs/xattr.c
> > > > +++ b/fs/bcachefs/xattr.c
> > > > @@ -96,6 +96,7 @@ int bch2_xattr_validate(struct bch_fs *c, struct bkey_s_c k,
> > > >                        c, xattr_invalid_type,
> > > >                        "invalid type (%u)", xattr.v->x_type);
> > > >
> > > > +     pr_info("__struct_size(x_name): %lu", __struct_size(xattr.v->x_name));
> > > >       bkey_fsck_err_on(memchr(xattr.v->x_name, '\0', xattr.v->x_name_len),
> > > >                        c, xattr_name_invalid_chars,
> > > >                        "xattr name has invalid characters");
> > > > diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> > > > index f14c275950b5..43ac0bca485d 100644
> > > > --- a/include/linux/compiler_types.h
> > > > +++ b/include/linux/compiler_types.h
> > > > @@ -413,7 +413,7 @@ struct ftrace_likely_data {
> > > >   * When the size of an allocated object is needed, use the best available
> > > >   * mechanism to find it. (For cases where sizeof() cannot be used.)
> > > >   */
> > > > -#if __has_builtin(__builtin_dynamic_object_size)
> > > > +#if __has_builtin(__builtin_dynamic_object_size) && !defined(__clang__)
> > > >  #define __struct_size(p)     __builtin_dynamic_object_size(p, 0)
> > > >  #define __member_size(p)     __builtin_dynamic_object_size(p, 1)
> > > >  #else
> > > >
> > > >
> > > > Here's the generated assembly for this:
> > > >
> > > >       mov     rdi, offset .L.str.3
> > > >       mov     rsi, offset .L__func__.bch2_xattr_validate
> > > >       mov     r12, rdx
> > > >       mov     rdx, -1
> > > >       call    _printk
> > > >       mov     rax, r12
> > > >       movzx   esi, ah
> > > >       movzx   edx, byte ptr [r12 + 1]
> > > >       cmp     rsi, rdx
> > > >       jb      .LBB4_15
> > > > # %bb.11:
> > > >       lea     rdi, [rax + 4]
> > > >       xor     ebx, ebx
> > > >       xor     esi, esi
> > > >       call    memchr
> > > >
> > > > So for the printk it hardcoded -1 (aka 0xFFFFF..., the maximum 64-bit unsigned value)
> > > > as the result of __struct_size. But then before the call to memchr it does
> > > > the same stuff again and puts the second least significant byte of the memory
> > > > address of x_type in esi, only to then load the correct value of x_name_len
> > > > into edx and compares them for the bounds-check.
> > > >
> > >
> > >
> > > __builtin_object_size should only ever be compile time known, right? So
> > > it looks like this is pretty broken atm.
> > >
Right. It's __builtin_dynamic_object_size that's known during runtime.

> > > I think until this stuff is fixed in clang the only real option is:
> > >
> There seems to be an issue with how the offset to the flexible array
> member is calculated internally. I'm looking into it now.
>
What Clang's doing is calculating the size of the object with this formula:

  size_t struct_size_including_flexible_array_members =
    MAX(sizeof(struct posix_acl),
        offsetof(struct posix_acl, a_entries) +
        sizeof(struct posix_acl_entry) * count);

The various sizes and offsets are as follows:

  sizeof(struct posix_acl) == 32
  sizeof(struct posix_acl_entry) == 8

  sizeof(a_refcount) == 4 :: offset == 0
  sizeof(a_rcu) == 16 :: offset == 8
  sizeof(a_count) == 4 :: offset == 24
  offsetof(a_entries) == 28

The resulting "real" size (according to Clang) is MAX(32, 28 + 8 * 1)
== 36. I believe it's padding that results in the size of 40 for the
malloc size. Does that description jibe with what you're seeing?
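
As a rough userspace check of that arithmetic, the formula can be written
as a small helper. The function name is made up for illustration and the
inputs are simply the sizes listed above; this is a sketch of the formula,
not Clang's actual implementation.

```c
#include <stddef.h>

/*
 * Sketch of the size formula quoted above:
 *   MAX(sizeof(struct), offsetof(struct, fam) + sizeof(elem) * count)
 * fam_object_size() is a hypothetical helper, not a Clang or kernel API.
 */
static size_t fam_object_size(size_t struct_size, size_t fam_offset,
			      size_t elem_size, size_t count)
{
	size_t with_fam = fam_offset + elem_size * count;

	return with_fam > struct_size ? with_fam : struct_size;
}
```

With the posix_acl numbers, fam_object_size(32, 28, 8, 1) evaluates to 36,
and with count == 0 it stays at sizeof(struct posix_acl) == 32.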

(For what it's worth, I think Clang is correct here.)

-bw
Jan Hendrik Farr Sept. 27, 2024, 3:41 a.m. UTC | #9
On 26 18:30:15, Bill Wendling wrote:
> On Thu, Sep 26, 2024 at 3:18 PM Bill Wendling <morbo@google.com> wrote:
> >
> > On Thu, Sep 26, 2024 at 12:58 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > (cc Kees and Bill)
> > >
> > > On Thu, 26 Sept 2024 at 19:46, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> > > >
> > > > On 26 19:01:20, Jan Hendrik Farr wrote:
> > > > > On 26 18:09:57, Thorsten Blum wrote:
> > > > > > On 26. Sep 2024, at 17:28, Thorsten Blum <thorsten.blum@toblux.com> wrote:
> > > > > > > I suspect it's the same Clang __bdos() "bug" as in [1] and [2].
> > > > > > >
> > > > > > > [1] https://lore.kernel.org/linux-kernel/3D0816D1-0807-4D37-8D5F-3C55CA910FAA@linux.dev/
> > > > > > > [2] https://lore.kernel.org/all/20240913164630.GA4091534@thelio-3990X/
> > > > > >
> > > > > > Could you try this and see if it resolves the problem?
> > > > > >
> > > > > > diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> > > > > > index 1a957ea2f4fe..b09759f31789 100644
> > > > > > --- a/include/linux/compiler_types.h
> > > > > > +++ b/include/linux/compiler_types.h
> > > > > > @@ -413,7 +413,7 @@ struct ftrace_likely_data {
> > > > > >   * When the size of an allocated object is needed, use the best available
> > > > > >   * mechanism to find it. (For cases where sizeof() cannot be used.)
> > > > > >   */
> > > > > > -#if __has_builtin(__builtin_dynamic_object_size)
> > > > > > +#if __has_builtin(__builtin_dynamic_object_size) && !defined(__clang__)
> > > > > >  #define __struct_size(p)   __builtin_dynamic_object_size(p, 0)
> > > > > >  #define __member_size(p)   __builtin_dynamic_object_size(p, 1)
> > > > > >  #else
> > > > > >
> > > > >
> > > > > Alright after looking at it in the debugger the code it generates now is
> > > > > just wild.
> > > > >
> > > > > I added one more printk before the call to memchr like so:
> > > > >
> > > > > diff --git a/fs/bcachefs/xattr.c b/fs/bcachefs/xattr.c
> > > > > index 56c8d3fe55a4..3c7c479ea3a8 100644
> > > > > --- a/fs/bcachefs/xattr.c
> > > > > +++ b/fs/bcachefs/xattr.c
> > > > > @@ -96,6 +96,7 @@ int bch2_xattr_validate(struct bch_fs *c, struct bkey_s_c k,
> > > > >                        c, xattr_invalid_type,
> > > > >                        "invalid type (%u)", xattr.v->x_type);
> > > > >
> > > > > +     pr_info("__struct_size(x_name): %lu", __struct_size(xattr.v->x_name));
> > > > >       bkey_fsck_err_on(memchr(xattr.v->x_name, '\0', xattr.v->x_name_len),
> > > > >                        c, xattr_name_invalid_chars,
> > > > >                        "xattr name has invalid characters");
> > > > > diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> > > > > index f14c275950b5..43ac0bca485d 100644
> > > > > --- a/include/linux/compiler_types.h
> > > > > +++ b/include/linux/compiler_types.h
> > > > > @@ -413,7 +413,7 @@ struct ftrace_likely_data {
> > > > >   * When the size of an allocated object is needed, use the best available
> > > > >   * mechanism to find it. (For cases where sizeof() cannot be used.)
> > > > >   */
> > > > > -#if __has_builtin(__builtin_dynamic_object_size)
> > > > > +#if __has_builtin(__builtin_dynamic_object_size) && !defined(__clang__)
> > > > >  #define __struct_size(p)     __builtin_dynamic_object_size(p, 0)
> > > > >  #define __member_size(p)     __builtin_dynamic_object_size(p, 1)
> > > > >  #else
> > > > >
> > > > >
> > > > > Here's the generated assembly for this:
> > > > >
> > > > >       mov     rdi, offset .L.str.3
> > > > >       mov     rsi, offset .L__func__.bch2_xattr_validate
> > > > >       mov     r12, rdx
> > > > >       mov     rdx, -1
> > > > >       call    _printk
> > > > >       mov     rax, r12
> > > > >       movzx   esi, ah
> > > > >       movzx   edx, byte ptr [r12 + 1]
> > > > >       cmp     rsi, rdx
> > > > >       jb      .LBB4_15
> > > > > # %bb.11:
> > > > >       lea     rdi, [rax + 4]
> > > > >       xor     ebx, ebx
> > > > >       xor     esi, esi
> > > > >       call    memchr
> > > > >
> > > > > > So for the printk it hardcoded -1 (aka 0xFFFFF..., the maximum 64-bit unsigned value)
> > > > > > as the result of __struct_size. But then before the call to memchr it does
> > > > > the same stuff again and puts the second least significant byte of the memory
> > > > > address of x_type in esi, only to then load the correct value of x_name_len
> > > > > into edx and compares them for the bounds-check.
> > > > >
> > > >
> > > >
> > > > __builtin_object_size should only ever be compile time known, right? So
> > > > it looks like this is pretty broken atm.
> > > >
> Right. It's __builtin_dynamic_object_size that's known during runtime.

Ok, so in the above example __struct_size was modified to use
__builtin_object_size instead of __builtin_dynamic_object_size. I would
expect clang to generate code that hardcodes the result of __struct_size
in that case. Why is it looking at any runtime values / pointers at all?
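
For comparison, a minimal userspace sketch of how the static builtin is
expected to behave. The two function names are illustrative only; the
builtin itself is provided by both gcc and clang.

```c
#include <stddef.h>

/* The object is a local array, so its size is a compile-time constant. */
static size_t known_object_size(void)
{
	char buf[16];

	return __builtin_object_size(buf, 0);	/* 16 */
}

/*
 * Nothing is known about what p points to. Unless the compiler can see
 * the object at the call site (e.g. after inlining), the static builtin
 * has no answer and falls back to (size_t)-1 for type 0.
 */
static size_t unknown_object_size(const char *p)
{
	return __builtin_object_size(p, 0);
}
```

The (size_t)-1 fallback corresponds to the "mov rdx, -1" passed to printk
in the quoted assembly. Neither case needs to look at a run-time value.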


But this is getting a little bit off-track, as this was the code it
generated for a possible fix / workaround. Let's stick to the original problem
(as in my first mail in this thread):

The struct in question is

struct bch_xattr {
	struct bch_val		v;
	__u8			x_type;
	__u8			x_name_len;
	__le16			x_val_len;
	__u8			x_name[] __counted_by(x_name_len);
} __packed __aligned(8);

found in fs/bcachefs/xattr_format.h
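
To make the offsets in the assembly easier to follow, here is a userspace
stand-in for the layout. It rests on two assumptions that are not spelled
out in this thread: the leading struct bch_val member occupies no space
(so it is left out), and __counted_by() does not change the layout (so it
is dropped). The struct and variable names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace stand-in for the layout of struct bch_xattr. Assumptions:
 * the leading struct bch_val member takes up no space and is left out,
 * and __counted_by() is dropped because it does not affect the layout.
 */
struct xattr_like {
	uint8_t		x_type;
	uint8_t		x_name_len;
	uint16_t	x_val_len;
	uint8_t		x_name[];
} __attribute__((packed, aligned(8)));

/* Offsets that show up as displacements in the generated assembly. */
static const size_t name_len_offset = offsetof(struct xattr_like, x_name_len);
static const size_t name_offset = offsetof(struct xattr_like, x_name);
```

Under these assumptions x_name_len sits at offset 1 and x_name at offset 4,
which lines up with "byte ptr [rbx + 1]" and "lea rdi, [rbx + 4]" in the
assembly from the first mail.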

Assume foo_ptr is a pointer to a struct bch_xattr:

My expectation is that __builtin_dynamic_object_size(foo_ptr->x_name, 0)
(aka __struct_size(foo_ptr->x_name)) will return the value of
foo_ptr->x_name_len known at runtime. This is also what the bounds-check of
memchr appears to be expecting:

__FORTIFY_INLINE __diagnose_as(__builtin_memchr, 1, 2, 3)
void *memchr(const void * const POS0 p, int c, __kernel_size_t size)
{
	const size_t p_size = __struct_size(p);

	if (__compiletime_lessthan(p_size, size))
		__read_overflow();
	if (p_size < size)
		fortify_panic(FORTIFY_FUNC_memchr, FORTIFY_READ, p_size, size, NULL);
	return __underlying_memchr(p, c, size);
}

found in include/linux/fortify-string.h
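
The run-time half of that check can be sketched in userspace as follows.
This is a simplified stand-in, not the kernel's implementation: p_size is
passed in explicitly where the kernel uses __struct_size(p), and the
overflow is reported through a flag instead of fortify_panic(). The name
checked_memchr() is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Simplified userspace stand-in for the run-time half of the fortified
 * memchr() above. p_size plays the role of __struct_size(p); instead of
 * calling fortify_panic() the sketch reports the overflow via *overflow.
 * checked_memchr() is a hypothetical name, not a kernel function.
 */
static void *checked_memchr(const void *p, size_t p_size, int c,
			    size_t size, int *overflow)
{
	*overflow = p_size < size;
	if (*overflow)
		return NULL;
	return memchr(p, c, size);
}
```

With p_size == 0 the check fires for any non-zero size, which is what the
dmesg line "12 byte read of buffer size 0" reports, even though the buffer
really holds the name. With p_size equal to the real name length it passes.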

But instead clang is generating code that behaves like this:
(((u_int64_t)foo_ptr) >> 8) & 0xFF

The assembly it generates (see first mail) is this:
	movzx edx, bh
(before this, rbx holds foo_ptr, i.e. the address of the struct; edx is
used to pass the result of __struct_size to printk)
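
The difference between the two computations can be shown with a pair of
small helpers. Both names are made up for illustration.

```c
#include <stdint.h>

/* What the generated code computes: byte 1 of the pointer value itself. */
static unsigned int second_byte_of_address(uint64_t addr)
{
	return (addr >> 8) & 0xFF;
}

/* What was expected: the byte stored at offset 1 of the pointed-to object. */
static unsigned int byte_at_offset_1(const void *p)
{
	return ((const unsigned char *)p)[1];
}
```

For a sample kernel address such as 0xffff98d559a400e8 (the RBX value from
the register dump, used here only as example input) the first helper
yields 0, while the second returns whatever is stored in the object at
offset 1, e.g. an x_name_len of 10.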


To illustrate the point I modified the function to read like this:

int bch2_xattr_validate(struct bch_fs *c, struct bkey_s_c k,
		       enum bch_validate_flags flags)
{
	struct bkey_s_c_xattr xattr = bkey_s_c_to_xattr(k);
	unsigned val_u64s = xattr_val_u64s(xattr.v->x_name_len,
					   le16_to_cpu(xattr.v->x_val_len));
	int ret = 0;

	bkey_fsck_err_on(bkey_val_u64s(k.k) < val_u64s,
			 c, xattr_val_size_too_small,
			 "value too small (%zu < %u)",
			 bkey_val_u64s(k.k), val_u64s);

	/* XXX why +4 ? */
	val_u64s = xattr_val_u64s(xattr.v->x_name_len,
				  le16_to_cpu(xattr.v->x_val_len) + 4);

	bkey_fsck_err_on(bkey_val_u64s(k.k) > val_u64s,
			 c, xattr_val_size_too_big,
			 "value too big (%zu > %u)",
			 bkey_val_u64s(k.k), val_u64s);

	bkey_fsck_err_on(!bch2_xattr_type_to_handler(xattr.v->x_type),
			 c, xattr_invalid_type,
			 "invalid type (%u)", xattr.v->x_type);

	pr_info("x_name_len: %d", xattr.v->x_name_len);
	pr_info("__struct_size(x_name): %lu", __struct_size(xattr.v->x_name));
	pr_info("__struct_size(x_name): %llu", ((((u_int64_t)(xattr.v)) >> 8)) & 0xFF);
	bkey_fsck_err_on(memchr(xattr.v->x_name, '\0', xattr.v->x_name_len),
			 c, xattr_name_invalid_chars,
			 "xattr name has invalid characters");
fsck_err:
	return ret;
}

Here's the part of the code generated for the pr_info calls:

	mov	rdi, offset .L.str.3
	mov	rsi, offset .L__func__.bch2_xattr_validate
	mov	qword ptr [rbp - 40], rdx
	mov	edx, eax
	call	_printk                         # pr_info("x_name_len: %d", xattr.v->x_name_len);
	mov	rax, qword ptr [rbp - 40]
	movzx	ebx, ah
	mov	rdi, offset .L.str.4
	mov	rsi, offset .L__func__.bch2_xattr_validate
	mov	rdx, rbx
	call	_printk                         # pr_info("__struct_size(x_name): %lu", __struct_size(xattr.v->x_name));
	mov	rdi, offset .L.str.5
	mov	rsi, offset .L__func__.bch2_xattr_validate
	mov	rdx, rbx
	call	_printk                         # pr_info("__struct_size(x_name): %llu", ((((u_int64_t)(xattr.v)) >> 8)) & 0xFF);

You can see that the last two printks use the same value in rdx. So clang
clearly treats __struct_size(xattr.v->x_name) and ((((u_int64_t)(xattr.v)) >> 8)) & 0xFF
as equivalent. My first mail also contains the LLVM IR showing similar
behavior.

The output of this example is:
[    0.638109] bcachefs: bch2_xattr_validate() x_name_len: 10
[    0.641068] bcachefs: bch2_xattr_validate() __struct_size(x_name): 0
[    0.643178] bcachefs: bch2_xattr_validate() __struct_size(x_name): 0



> 
> > > > I think until this stuff is fixed in clang the only real option is:
> > > >
> > There seems to be an issue with how the offset to the flexible array
> > member is calculated internally. I'm looking into it now.
> >
> What Clang's doing is calculating the size of the object with this formula:
> 
>   size_t struct_size_including_flexible_array_members =
>     MAX(sizeof(struct posix_acl),
>         offsetof(struct posix_acl, a_entries) +
>         sizeof(struct posix_acl_entry) * count);
> 
> The various sizes and offsets are as follows:
> 
>   sizeof(struct posix_acl) == 32
>   sizeof(struct posix_acl_entry) == 8
> 
>   sizeof(a_refcount) == 4 :: offset == 0
>   sizeof(a_rcu) == 16 :: offset == 8
>   sizeof(a_count) == 4 :: offset == 24
>   offsetof(a_entries) == 28
> 
> The resulting "real" size (according to Clang) is MAX(32, 28 + 8 * 1)
> == 36. I believe it's padding that results in the size of 40 for the
> malloc size. Does that description jibe with what you're seeing?
> 
> (For what it's worth, I think Clang is correct here.)
> 
> -bw

I assume this relates to [1]? I haven't looked at that one in depth,
so I'm not sure whether it's the same issue, a related one, or something different.

[1] https://lore.kernel.org/linux-kernel/3D0816D1-0807-4D37-8D5F-3C55CA910FAA@linux.dev/#t


Best Regards
Jan
Jan Hendrik Farr Sept. 28, 2024, 5:36 p.m. UTC | #10
On 26 18:09:57, Thorsten Blum wrote:
> On 26. Sep 2024, at 17:28, Thorsten Blum <thorsten.blum@toblux.com> wrote:
> > On 26. Sep 2024, at 17:14, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> >> 
> >> Hi Kent,
> >> 
> >> found a strange regression in the patch set for 6.12.
> >> 
> >> First bad commit is: 86e92eeeb23741a072fe7532db663250ff2e726a
> >> bcachefs: Annotate struct bch_xattr with __counted_by()
> >> 
> >> When compiling with clang 18.1.8 (also with latest llvm main branch) and
> >> CONFIG_FORTIFY_SOURCE=y my rootfs does not mount because there is an erroneous
> >> detection of a buffer overflow.
> >> 
> >> The __counted_by attribute is supposed to be supported starting with gcc 15,
> >> not sure if it is implemented yet so I haven't tested with gcc trunk yet.
> >> 
> >> Here's the relevant section of dmesg:
> >> 
> >> [    6.248736] bcachefs (nvme1n1p2): starting version 1.12: rebalance_work_acct_fix
> >> [    6.248744] bcachefs (nvme1n1p2): recovering from clean shutdown, journal seq 1305969
> >> [    6.252374] ------------[ cut here ]------------
> >> [    6.252375] memchr: detected buffer overflow: 12 byte read of buffer size 0
> >> [    6.252379] WARNING: CPU: 18 PID: 511 at lib/string_helpers.c:1033 __fortify_report+0x45/0x50
> >> [    6.252383] Modules linked in: bcachefs lz4hc_compress lz4_compress hid_generic usbhid btrfs crct10dif_pclmul libcrc32c crc32_pclmul crc32c_generic polyval_clmulni crc32c_intel polyval_generic raid6_pq ghash_clmulni_intel xor sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel gf128mul nvme crypto_simd ccp xhci_pci cryptd sp5100_tco xhci_pci_renesas nvme_core nvme_auth video wmi ip6_tables ip_tables x_tables i2c_dev
> >> [    6.252404] CPU: 18 UID: 0 PID: 511 Comm: mount Not tainted 6.11.0-10065-g6fa6588e5964 #98 d8e0beb515d91b387aa60970de7203f35ddd182c
> >> [    6.252406] Hardware name: Micro-Star International Co., Ltd. MS-7D78/PRO B650-P WIFI (MS-7D78), BIOS 1.C0 02/06/2024
> >> [    6.252407] RIP: 0010:__fortify_report+0x45/0x50
> >> [    6.252409] Code: 48 8b 34 c5 30 92 21 87 40 f6 c7 01 48 c7 c0 75 1b 0a 87 48 c7 c1 e1 93 07 87 48 0f 44 c8 48 c7 c7 ef 03 10 87 e8 0b c2 9b ff <0f> 0b e9 cf 5d 9e 00 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90
> >> [    6.252410] RSP: 0018:ffffbb3d03aff350 EFLAGS: 00010246
> >> [    6.252412] RAX: 4ce590fb7c372800 RBX: ffff98d559a400e8 RCX: 0000000000000027
> >> [    6.252413] RDX: 0000000000000002 RSI: 00000000ffffdfff RDI: ffff98e43db21a08
> >> [    6.252414] RBP: ffff98d559a400d0 R08: 0000000000001fff R09: ffff98e47ddcd000
> >> [    6.252415] R10: 0000000000005ffd R11: 0000000000000004 R12: ffff98d559a40000
> >> [    6.252416] R13: ffff98d54abf1320 R14: ffffbb3d03aff430 R15: 0000000000000000
> >> [    6.252417] FS:  00007efc82117800(0000) GS:ffff98e43db00000(0000) knlGS:0000000000000000
> >> [    6.252418] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> [    6.252419] CR2: 000055d96658ea80 CR3: 000000010a12c000 CR4: 0000000000f50ef0
> >> [    6.252420] PKRU: 55555554
> >> [    6.252421] Call Trace:
> >> [    6.252423]  <TASK>
> >> [    6.252425]  ? __warn+0xd5/0x1d0
> >> [    6.252427]  ? __fortify_report+0x45/0x50
> >> [    6.252429]  ? report_bug+0x144/0x1f0
> >> [    6.252431]  ? __fortify_report+0x45/0x50
> >> [    6.252433]  ? handle_bug+0x6a/0x90
> >> [    6.252435]  ? exc_invalid_op+0x1a/0x50
> >> [    6.252436]  ? asm_exc_invalid_op+0x1a/0x20
> >> [    6.252440]  ? __fortify_report+0x45/0x50
> >> [    6.252441]  __fortify_panic+0x9/0x10
> >> [    6.252443]  bch2_xattr_validate+0x13b/0x140 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
> >> [    6.252463]  bch2_btree_node_read_done+0x125a/0x17a0 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
> >> [    6.252482]  btree_node_read_work+0x202/0x4a0 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
> >> [    6.252499]  bch2_btree_node_read+0xa8d/0xb20 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
> >> [    6.252514]  ? srso_alias_return_thunk+0x5/0xfbef5
> >> [    6.252515]  ? pcpu_alloc_noprof+0x741/0xb50
> >> [    6.252517]  ? srso_alias_return_thunk+0x5/0xfbef5
> >> [    6.252519]  ? time_stats_update_one+0x75/0x1f0 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
> >> 
> >> ...
> >> 
> >> 
> >> The memchr in question is at:
> >> https://github.com/torvalds/linux/blob/11a299a7933e03c83818b431e6a1c53ad387423d/fs/bcachefs/xattr.c#L99
> >> 
> >> There is not actually a buffer overflow here, I checked with gdb that
> >> xattr.v->x_name does actually contain a string of the correct length and
> >> xattr.v->x_name_len contains the correct length and should be used to determine
> >> the length when memchr uses __struct_size for bounds-checking due to the
> >> __counted_by annotation.
> >> 
> >> I'm at the point where I think this is probably a bug in clang. I have a patch
> >> that does fix (more like bandaid) the problem and adds some print statements:
> >> 
> >> --
> >> diff --git a/fs/bcachefs/xattr.c b/fs/bcachefs/xattr.c
> >> index 56c8d3fe55a4..8d7e749b7dda 100644
> >> --- a/fs/bcachefs/xattr.c
> >> +++ b/fs/bcachefs/xattr.c
> >> @@ -74,6 +74,7 @@ int bch2_xattr_validate(struct bch_fs *c, struct bkey_s_c k,
> >>      enum bch_validate_flags flags)
> >> {
> >> struct bkey_s_c_xattr xattr = bkey_s_c_to_xattr(k);
> >> + const struct bch_xattr *v = (void *)k.v;
> >> unsigned val_u64s = xattr_val_u64s(xattr.v->x_name_len,
> >>  le16_to_cpu(xattr.v->x_val_len));
> >> int ret = 0;
> >> @@ -94,9 +95,12 @@ int bch2_xattr_validate(struct bch_fs *c, struct bkey_s_c k,
> >> 
> >> bkey_fsck_err_on(!bch2_xattr_type_to_handler(xattr.v->x_type),
> >> c, xattr_invalid_type,
> >> - "invalid type (%u)", xattr.v->x_type);
> >> + "invalid type (%u)", v->x_type);
> >> 
> >> - bkey_fsck_err_on(memchr(xattr.v->x_name, '\0', xattr.v->x_name_len),
> >> + pr_info("x_name_len: %d", v->x_name_len);
> >> + pr_info("__struct_size(x_name): %ld", __struct_size(v->x_name));
> >> + pr_info("__struct_size(x_name): %ld", __struct_size(xattr.v->x_name));
> >> + bkey_fsck_err_on(memchr(v->x_name, '\0', v->x_name_len),
> >> c, xattr_name_invalid_chars,
> >> "xattr name has invalid characters");
> >> fsck_err:
> >> --
> >> 
> >> 
> >> Making memchr access via a pointer created with
> >> const struct bch_xattr *v = (void *)k.v fixes it. From the print statements I
> >> can see that __struct_size(xattr.v->x_name) incorrectly returns 0, while
> >> __struct_size(v->x_name) correctly returns 10 in this case (the value of
> >> x_name_len).
> >> 
> >> The generated assembly illustrates what is going wrong. Below is an excerpt
> >> of the assembly clang generated for the bch2_xattr_validate function:
> >> 
> >> mov r13d, ecx
> >> mov r15, rdi
> >> mov r14, rsi
> >> mov rdi, offset .L.str.3
> >> mov rsi, offset .L__func__.bch2_xattr_validate
> >> mov rbx, rdx
> >> mov edx, eax
> >> call _printk
> >> movzx edx, byte ptr [rbx + 1]
> >> mov rdi, offset .L.str.4
> >> mov rsi, offset .L__func__.bch2_xattr_validate
> >> call _printk
> >> movzx edx, bh
> >> mov rdi, offset .L.str.4
> >> mov rsi, offset .L__func__.bch2_xattr_validate
> >> call _printk
> >> lea rdi, [rbx + 4]
> >> mov r12, rbx
> >> movzx edx, byte ptr [rbx + 1]
> >> xor ebx, ebx
> >> xor esi, esi
> >> call memchr
> >> 
> >> At the start of this rdx contains k.v (and is moved into rbx). The three calls
> >> to printk are the ones you can see in my patch. You can see that for the
> >> print that uses __struct_size(v->x_name) the compiler correctly uses
> >> movzx edx, byte ptr [rbx + 1]
> >> to load x_name_len into edx.
> >> 
> >> For the printk call that uses __struct_size(xattr.v->x_name) however the
> >> compiler uses
> >> movzx edx, bh
> >> So it will print the high 8 bits of the lower 16 bits (second least
> >> significant byte) of the memory address of xattr.v->x_type. This is obviously
> >> completely wrong.
> >> 
> >> It is then doing the correct call of memchr because this is using my patch.
> >> Without my patch it would be doing the same thing for the call to memchr where
> >> it uses the second least significant byte of the memory address of x_type as the
> >> length used for the bounds-check.
> >> 
> >> 
> >> 
> >> The LLVM IR also shows the same problem:
> >> 
> >> define internal zeroext i1 @xattr_cmp_key(ptr nocapture readnone %0, ptr %1, ptr nocapture noundef readonly %2) #0 align 16 {
> >> [...]
> >> %51 = ptrtoint ptr %2 to i64
> >> %52 = lshr i64 %51, 8
> >> %53 = and i64 %52, 255
> >> 
> >> This is the IR for the incorrect behavior. It simply converts the pointer to an
> >> int, shifts it right by 8 bits, then ANDs it with 0xFF. If it did a load (to i64)
> >> instead of ptrtoint this would actually work, as the second least significant
> >> byte of an i64 loaded from that memory address does contain the value of
> >> x_name_len. It's as if clang forgot to dereference a pointer here.
> >> 
> >> Correct IR does this (for the other printk invocation):
> >> 
> >> define internal zeroext i1 @xattr_cmp_key(ptr nocapture readnone %0, ptr %1, ptr nocapture noundef readonly %2) #0 align 16 {
> >> [...]
> >> %4 = getelementptr inbounds %struct.bch_xattr, ptr %1, i64 0, i32 1
> >> %5 = load i8, ptr %4, align 8
> >> [...]
> >> %48 = load i8, ptr %5, align 4
> >> %49 = zext i8 %48 to i64
> >> 
> >> Best Regards
> >> Jan
> > 
> > I suspect it's the same Clang __bdos() "bug" as in [1] and [2].
> > 
> > [1] https://lore.kernel.org/linux-kernel/3D0816D1-0807-4D37-8D5F-3C55CA910FAA@linux.dev/
> > [2] https://lore.kernel.org/all/20240913164630.GA4091534@thelio-3990X/
> 
> Could you try this and see if it resolves the problem?
> 
> diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> index 1a957ea2f4fe..b09759f31789 100644
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -413,7 +413,7 @@ struct ftrace_likely_data {
>   * When the size of an allocated object is needed, use the best available
>   * mechanism to find it. (For cases where sizeof() cannot be used.)
>   */
> -#if __has_builtin(__builtin_dynamic_object_size)
> +#if __has_builtin(__builtin_dynamic_object_size) && !defined(__clang__)
>  #define __struct_size(p)	__builtin_dynamic_object_size(p, 0)
>  #define __member_size(p)	__builtin_dynamic_object_size(p, 1)
>  #else


Alright, figured out why this fix doesn't work. The function signature
of memchr is:

void *memchr(const void * const POS0 p, int c, __kernel_size_t size)

The POS0 is the culprit. It's defined as __pass_object_size(0), which
leads to the call to __builtin_object_size being upgraded to
__builtin_dynamic_object_size.

So to make this work, the POS0 definition needs the same
!defined(__clang__) check. There are also two more
__has_builtin(__builtin_dynamic_object_size) checks in
lib/fortify_kunit.c, but they have no impact here.

Now the fix works:


--
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index f14c275950b5..43ac0bca485d 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -413,7 +413,7 @@ struct ftrace_likely_data {
  * When the size of an allocated object is needed, use the best available
  * mechanism to find it. (For cases where sizeof() cannot be used.)
  */
-#if __has_builtin(__builtin_dynamic_object_size)
+#if __has_builtin(__builtin_dynamic_object_size) && !defined(__clang__)
 #define __struct_size(p)	__builtin_dynamic_object_size(p, 0)
 #define __member_size(p)	__builtin_dynamic_object_size(p, 1)
 #else
diff --git a/include/linux/fortify-string.h b/include/linux/fortify-string.h
index 0d99bf11d260..7235655d9b80 100644
--- a/include/linux/fortify-string.h
+++ b/include/linux/fortify-string.h
@@ -148,7 +148,7 @@ extern char *__underlying_strncpy(char *p, const char *q, __kernel_size_t size)
  * size, rather than struct size), but there remain some stragglers using
  * type 0 that will be converted in the future.
  */
-#if __has_builtin(__builtin_dynamic_object_size)
+#if __has_builtin(__builtin_dynamic_object_size) && !defined(__clang__)
 #define POS			__pass_dynamic_object_size(1)
 #define POS0			__pass_dynamic_object_size(0)
 #else
Jan Hendrik Farr Sept. 28, 2024, 5:49 p.m. UTC | #11
On 28 19:36:27, Jan Hendrik Farr wrote:
> Alright, figured out why this fix doesn't work. The function signature
> of memchr is:
> 
> void *memchr(const void * const POS0 p, int c, __kernel_size_t size)
> 
> The POS0 is the culprit. It's defined as __pass_object_size(0), which
> leads to the call to __builtin_object_size being upgraded to
> __builtin_dynamic_object_size.

Correction: POS0 is defined as __pass_dynamic_object_size(0) of course.
The below patch changes it to be defined as __pass_object_size(0).

> 
> So to make this work, the POS0 definition needs the same
> !defined(__clang__) check. There are also two more
> __has_builtin(__builtin_dynamic_object_size) checks in
> lib/fortify_kunit.c, but they have no impact here.
> 
> Now the fix works:
> 
> 
> --
> diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> index f14c275950b5..43ac0bca485d 100644
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -413,7 +413,7 @@ struct ftrace_likely_data {
>   * When the size of an allocated object is needed, use the best available
>   * mechanism to find it. (For cases where sizeof() cannot be used.)
>   */
> -#if __has_builtin(__builtin_dynamic_object_size)
> +#if __has_builtin(__builtin_dynamic_object_size) && !defined(__clang__)
>  #define __struct_size(p)	__builtin_dynamic_object_size(p, 0)
>  #define __member_size(p)	__builtin_dynamic_object_size(p, 1)
>  #else
> diff --git a/include/linux/fortify-string.h b/include/linux/fortify-string.h
> index 0d99bf11d260..7235655d9b80 100644
> --- a/include/linux/fortify-string.h
> +++ b/include/linux/fortify-string.h
> @@ -148,7 +148,7 @@ extern char *__underlying_strncpy(char *p, const char *q, __kernel_size_t size)
>   * size, rather than struct size), but there remain some stragglers using
>   * type 0 that will be converted in the future.
>   */
> -#if __has_builtin(__builtin_dynamic_object_size)
> +#if __has_builtin(__builtin_dynamic_object_size) && !defined(__clang__)
>  #define POS			__pass_dynamic_object_size(1)
>  #define POS0			__pass_dynamic_object_size(0)
>  #else
Kees Cook Sept. 28, 2024, 8:34 p.m. UTC | #12
On Sat, Sep 28, 2024 at 07:36:24PM +0200, Jan Hendrik Farr wrote:
> On 26 18:09:57, Thorsten Blum wrote:
> > On 26. Sep 2024, at 17:28, Thorsten Blum <thorsten.blum@toblux.com> wrote:
> > > On 26. Sep 2024, at 17:14, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> > >> 
> > >> Hi Kent,
> > >> 
> > >> found a strange regression in the patch set for 6.12.
> > >> 
> > >> First bad commit is: 86e92eeeb23741a072fe7532db663250ff2e726a
> > >> bcachefs: Annotate struct bch_xattr with __counted_by()
> > >> 
> > >> When compiling with clang 18.1.8 (also with latest llvm main branch) and
> > >> CONFIG_FORTIFY_SOURCE=y my rootfs does not mount because there is an erroneous
> > >> detection of a buffer overflow.
> > >> 
> > >> The __counted_by attribute is supposed to be supported starting with gcc 15,
> > >> not sure if it is implemented yet so I haven't tested with gcc trunk yet.
> > >> 
> > >> Here's the relevant section of dmesg:
> > >> 
> > >> [    6.248736] bcachefs (nvme1n1p2): starting version 1.12: rebalance_work_acct_fix
> > >> [    6.248744] bcachefs (nvme1n1p2): recovering from clean shutdown, journal seq 1305969
> > >> [    6.252374] ------------[ cut here ]------------
> > >> [    6.252375] memchr: detected buffer overflow: 12 byte read of buffer size 0
> > >> [    6.252379] WARNING: CPU: 18 PID: 511 at lib/string_helpers.c:1033 __fortify_report+0x45/0x50
> > >> [    6.252383] Modules linked in: bcachefs lz4hc_compress lz4_compress hid_generic usbhid btrfs crct10dif_pclmul libcrc32c crc32_pclmul crc32c_generic polyval_clmulni crc32c_intel polyval_generic raid6_pq ghash_clmulni_intel xor sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel gf128mul nvme crypto_simd ccp xhci_pci cryptd sp5100_tco xhci_pci_renesas nvme_core nvme_auth video wmi ip6_tables ip_tables x_tables i2c_dev
> > >> [    6.252404] CPU: 18 UID: 0 PID: 511 Comm: mount Not tainted 6.11.0-10065-g6fa6588e5964 #98 d8e0beb515d91b387aa60970de7203f35ddd182c
> > >> [    6.252406] Hardware name: Micro-Star International Co., Ltd. MS-7D78/PRO B650-P WIFI (MS-7D78), BIOS 1.C0 02/06/2024
> > >> [    6.252407] RIP: 0010:__fortify_report+0x45/0x50
> > >> [    6.252409] Code: 48 8b 34 c5 30 92 21 87 40 f6 c7 01 48 c7 c0 75 1b 0a 87 48 c7 c1 e1 93 07 87 48 0f 44 c8 48 c7 c7 ef 03 10 87 e8 0b c2 9b ff <0f> 0b e9 cf 5d 9e 00 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90
> > >> [    6.252410] RSP: 0018:ffffbb3d03aff350 EFLAGS: 00010246
> > >> [    6.252412] RAX: 4ce590fb7c372800 RBX: ffff98d559a400e8 RCX: 0000000000000027
> > >> [    6.252413] RDX: 0000000000000002 RSI: 00000000ffffdfff RDI: ffff98e43db21a08
> > >> [    6.252414] RBP: ffff98d559a400d0 R08: 0000000000001fff R09: ffff98e47ddcd000
> > >> [    6.252415] R10: 0000000000005ffd R11: 0000000000000004 R12: ffff98d559a40000
> > >> [    6.252416] R13: ffff98d54abf1320 R14: ffffbb3d03aff430 R15: 0000000000000000
> > >> [    6.252417] FS:  00007efc82117800(0000) GS:ffff98e43db00000(0000) knlGS:0000000000000000
> > >> [    6.252418] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > >> [    6.252419] CR2: 000055d96658ea80 CR3: 000000010a12c000 CR4: 0000000000f50ef0
> > >> [    6.252420] PKRU: 55555554
> > >> [    6.252421] Call Trace:
> > >> [    6.252423]  <TASK>
> > >> [    6.252425]  ? __warn+0xd5/0x1d0
> > >> [    6.252427]  ? __fortify_report+0x45/0x50
> > >> [    6.252429]  ? report_bug+0x144/0x1f0
> > >> [    6.252431]  ? __fortify_report+0x45/0x50
> > >> [    6.252433]  ? handle_bug+0x6a/0x90
> > >> [    6.252435]  ? exc_invalid_op+0x1a/0x50
> > >> [    6.252436]  ? asm_exc_invalid_op+0x1a/0x20
> > >> [    6.252440]  ? __fortify_report+0x45/0x50
> > >> [    6.252441]  __fortify_panic+0x9/0x10
> > >> [    6.252443]  bch2_xattr_validate+0x13b/0x140 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
> > >> [    6.252463]  bch2_btree_node_read_done+0x125a/0x17a0 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
> > >> [    6.252482]  btree_node_read_work+0x202/0x4a0 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
> > >> [    6.252499]  bch2_btree_node_read+0xa8d/0xb20 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
> > >> [    6.252514]  ? srso_alias_return_thunk+0x5/0xfbef5
> > >> [    6.252515]  ? pcpu_alloc_noprof+0x741/0xb50
> > >> [    6.252517]  ? srso_alias_return_thunk+0x5/0xfbef5
> > >> [    6.252519]  ? time_stats_update_one+0x75/0x1f0 [bcachefs 8361179bbfcc59e669df38aec976f02d7211a659]
> > >> 
> > >> ...
> > >> 
> > >> 
> > >> The memchr in question is at:
> > >> https://github.com/torvalds/linux/blob/11a299a7933e03c83818b431e6a1c53ad387423d/fs/bcachefs/xattr.c#L99
> > >> 
> > >> There is not actually a buffer overflow here, I checked with gdb that
> > >> xattr.v->x_name does actually contain a string of the correct length and
> > >> xattr.v->x_name_len contains the correct length and should be used to determine
> > >> the length when memchr uses __struct_size for bounds-checking due to the
> > >> __counted_by annotation.
> > >> 
> > >> I'm at the point where I think this is probably a bug in clang. I have a patch
> > >> that does fix (more like bandaid) the problem and adds some print statements:
> > >> 
> > >> --
> > >> diff --git a/fs/bcachefs/xattr.c b/fs/bcachefs/xattr.c
> > >> index 56c8d3fe55a4..8d7e749b7dda 100644
> > >> --- a/fs/bcachefs/xattr.c
> > >> +++ b/fs/bcachefs/xattr.c
> > >> @@ -74,6 +74,7 @@ int bch2_xattr_validate(struct bch_fs *c, struct bkey_s_c k,
> > >>      enum bch_validate_flags flags)
> > >> {
> > >> struct bkey_s_c_xattr xattr = bkey_s_c_to_xattr(k);
> > >> + const struct bch_xattr *v = (void *)k.v;
> > >> unsigned val_u64s = xattr_val_u64s(xattr.v->x_name_len,
> > >>  le16_to_cpu(xattr.v->x_val_len));
> > >> int ret = 0;
> > >> @@ -94,9 +95,12 @@ int bch2_xattr_validate(struct bch_fs *c, struct bkey_s_c k,
> > >> 
> > >> bkey_fsck_err_on(!bch2_xattr_type_to_handler(xattr.v->x_type),
> > >> c, xattr_invalid_type,
> > >> - "invalid type (%u)", xattr.v->x_type);
> > >> + "invalid type (%u)", v->x_type);
> > >> 
> > >> - bkey_fsck_err_on(memchr(xattr.v->x_name, '\0', xattr.v->x_name_len),
> > >> + pr_info("x_name_len: %d", v->x_name_len);
> > >> + pr_info("__struct_size(x_name): %ld", __struct_size(v->x_name));
> > >> + pr_info("__struct_size(x_name): %ld", __struct_size(xattr.v->x_name));
> > >> + bkey_fsck_err_on(memchr(v->x_name, '\0', v->x_name_len),
> > >> c, xattr_name_invalid_chars,
> > >> "xattr name has invalid characters");
> > >> fsck_err:
> > >> --
> > >> 
> > >> 
> > >> Making memchr access via a pointer created with
> > >> const struct bch_xattr *v = (void *)k.v fixes it. From the print statements I
> > >> can see that __struct_size(xattr.v->x_name) incorrectly returns 0, while
> > >> __struct_size(v->x_name) correctly returns 10 in this case (the value of
> > >> x_name_len).
> > >> 
> > >> The generated assembly illustrates what is going wrong. Below is an excerpt
> > >> of the assembly clang generated for the bch2_xattr_validate function:
> > >> 
> > >> mov r13d, ecx
> > >> mov r15, rdi
> > >> mov r14, rsi
> > >> mov rdi, offset .L.str.3
> > >> mov rsi, offset .L__func__.bch2_xattr_validate
> > >> mov rbx, rdx
> > >> mov edx, eax
> > >> call _printk
> > >> movzx edx, byte ptr [rbx + 1]
> > >> mov rdi, offset .L.str.4
> > >> mov rsi, offset .L__func__.bch2_xattr_validate
> > >> call _printk
> > >> movzx edx, bh
> > >> mov rdi, offset .L.str.4
> > >> mov rsi, offset .L__func__.bch2_xattr_validate
> > >> call _printk
> > >> lea rdi, [rbx + 4]
> > >> mov r12, rbx
> > >> movzx edx, byte ptr [rbx + 1]
> > >> xor ebx, ebx
> > >> xor esi, esi
> > >> call memchr
> > >> 
> > >> At the start of this rdx contains k.v (and is moved into rbx). The three calls
> > >> to printk are the ones you can see in my patch. You can see that for the
> > >> print that uses __struct_size(v->x_name) the compiler correctly uses
> > >> movzx edx, byte ptr [rbx + 1]
> > >> to load x_name_len into edx.
> > >> 
> > >> For the printk call that uses __struct_size(xattr.v->x_name) however the
> > >> compiler uses
> > >> movzx edx, bh
> > >> So it will print the high 8 bits of the lower 16 bits (second least
> > >> significant byte) of the memory address of xattr.v->x_type. This is obviously
> > >> completely wrong.
> > >> 
> > >> It is then doing the correct call of memchr because this is using my patch.
> > >> Without my patch it would be doing the same thing for the call to memchr where
> > >> it uses the second least significant byte of the memory address of x_type as the
> > >> length used for the bounds-check.
> > >> 
> > >> 
> > >> 
> > >> The LLVM IR also shows the same problem:
> > >> 
> > >> define internal zeroext i1 @xattr_cmp_key(ptr nocapture readnone %0, ptr %1, ptr nocapture noundef readonly %2) #0 align 16 {
> > >> [...]
> > >> %51 = ptrtoint ptr %2 to i64
> > >> %52 = lshr i64 %51, 8
> > >> %53 = and i64 %52, 255
> > >> 
> > >> This is the IR for the incorrect behavior. It simply converts the pointer to an
> > >> int, shifts right by 8 bits, then ANDs with 0xFF. If it did a load (to i64)
> > >> instead of ptrtoint this would actually work, as the second least significant
> > >> byte of an i64 loaded from that memory address does contain the value of
> > >> x_name_len. It's as if clang forgot to dereference a pointer here.
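
To make the IR difference described above concrete, here is a small
userspace sketch in C. The function names are made up for illustration
and the code is not taken from the kernel or from clang. It assumes a
little-endian machine and a buffer of at least 8 readable bytes laid out
like the start of struct bch_xattr (x_type at offset 0, x_name_len at
offset 1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the buggy IR: ptrtoint, lshr 8, and 255, applied to the
 * pointer value itself instead of the memory it points to. */
static unsigned int count_from_pointer_bits(const void *p)
{
	return (unsigned int)(((uintptr_t)p >> 8) & 0xff);
}

/* The variant discussed above: load 8 bytes from the pointer, then
 * shift and mask. Assumes little-endian and 8 readable bytes. */
static unsigned int count_from_loaded_word(const void *p)
{
	uint64_t word;

	assert(p != NULL);
	memcpy(&word, p, sizeof(word));
	return (unsigned int)((word >> 8) & 0xff);
}

/* Mirrors the correct IR: a one-byte load from offset 1. */
static unsigned int count_from_byte_load(const void *p)
{
	assert(p != NULL);
	return ((const unsigned char *)p)[1];
}
```

With a buffer whose second byte is 10, the last two helpers both return
10 on a little-endian machine, while the first one returns whatever
happens to be in bits 8..15 of the buffer's address.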
> > >> 
> > >> Correct IR does this (for the other printk invocation):
> > >> 
> > >> define internal zeroext i1 @xattr_cmp_key(ptr nocapture readnone %0, ptr %1, ptr nocapture noundef readonly %2) #0 align 16 {
> > >> [...]
> > >> %4 = getelementptr inbounds %struct.bch_xattr, ptr %1, i64 0, i32 1
> > >> %5 = load i8, ptr %4, align 8
> > >> [...]
> > >> %48 = load i8, ptr %5, align 4
> > >> %49 = zext i8 %48 to i64
> > >> 
> > >> Best Regards
> > >> Jan
> > > 
> > > I suspect it's the same Clang __bdos() "bug" as in [1] and [2].
> > > 
> > > [1] https://lore.kernel.org/linux-kernel/3D0816D1-0807-4D37-8D5F-3C55CA910FAA@linux.dev/
> > > [2] https://lore.kernel.org/all/20240913164630.GA4091534@thelio-3990X/
> > 
> > Could you try this and see if it resolves the problem?
> > 
> > diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> > index 1a957ea2f4fe..b09759f31789 100644
> > --- a/include/linux/compiler_types.h
> > +++ b/include/linux/compiler_types.h
> > @@ -413,7 +413,7 @@ struct ftrace_likely_data {
> >   * When the size of an allocated object is needed, use the best available
> >   * mechanism to find it. (For cases where sizeof() cannot be used.)
> >   */
> > -#if __has_builtin(__builtin_dynamic_object_size)
> > +#if __has_builtin(__builtin_dynamic_object_size) && !defined(__clang__)
> >  #define __struct_size(p)	__builtin_dynamic_object_size(p, 0)
> >  #define __member_size(p)	__builtin_dynamic_object_size(p, 1)
> >  #else
> 
> 
> Alright, figured out why this fix doesn't work. The function signature
> of memchr is:
> 
> void *memchr(const void * const POS0 p, int c, __kernel_size_t size)
> 
> The POS0 is the culprit. It's defined as __pass_object_size(0), which
> leads to the call to __builtin_object_size being upgraded to
> __builtin_dynamic_object_size.
> 
> So to make this work the POS0 definition needs the same
> !defined(__clang__) on it. There's also two more
> __has_builtin(__builtin_dynamic_object_size) checks in
> lib/fortify_kunit.c. But they have no impact.
> 
> Now the fix works:
> 
> 
> --
> diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> index f14c275950b5..43ac0bca485d 100644
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -413,7 +413,7 @@ struct ftrace_likely_data {
>   * When the size of an allocated object is needed, use the best available
>   * mechanism to find it. (For cases where sizeof() cannot be used.)
>   */
> -#if __has_builtin(__builtin_dynamic_object_size)
> +#if __has_builtin(__builtin_dynamic_object_size) && !defined(__clang__)
>  #define __struct_size(p)	__builtin_dynamic_object_size(p, 0)
>  #define __member_size(p)	__builtin_dynamic_object_size(p, 1)
>  #else
> diff --git a/include/linux/fortify-string.h b/include/linux/fortify-string.h
> index 0d99bf11d260..7235655d9b80 100644
> --- a/include/linux/fortify-string.h
> +++ b/include/linux/fortify-string.h
> @@ -148,7 +148,7 @@ extern char *__underlying_strncpy(char *p, const char *q, __kernel_size_t size)
>   * size, rather than struct size), but there remain some stragglers using
>   * type 0 that will be converted in the future.
>   */
> -#if __has_builtin(__builtin_dynamic_object_size)
> +#if __has_builtin(__builtin_dynamic_object_size) && !defined(__clang__)
>  #define POS			__pass_dynamic_object_size(1)
>  #define POS0			__pass_dynamic_object_size(0)
>  #else

Sorry, I've been out of commission with covid. Globally disabling this
macro for clang is not the right solution (way too big a hammer).

Until Bill has a fix, we can revert commit
86e92eeeb23741a072fe7532db663250ff2e726a, as the problem is limited to
certain situations where 'counted_by' is in use.

-Kees
Kees Cook Sept. 28, 2024, 8:50 p.m. UTC | #13
On Thu, Sep 26, 2024 at 06:30:15PM -0700, Bill Wendling wrote:
> On Thu, Sep 26, 2024 at 3:18 PM Bill Wendling <morbo@google.com> wrote:
> >
> > On Thu, Sep 26, 2024 at 12:58 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > (cc Kees and Bill)
> > >
> > > On Thu, 26 Sept 2024 at 19:46, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> > > >
> > > > On 26 19:01:20, Jan Hendrik Farr wrote:
> > > > > On 26 18:09:57, Thorsten Blum wrote:
> > > > > > On 26. Sep 2024, at 17:28, Thorsten Blum <thorsten.blum@toblux.com> wrote:
> > > > > > > On 26. Sep 2024, at 17:14, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> [...]
> > > > > > >> [    6.252375] memchr: detected buffer overflow: 12 byte read of buffer size 0
> [...]
> > > > >       bkey_fsck_err_on(memchr(xattr.v->x_name, '\0', xattr.v->x_name_len),
> > > > >                        c, xattr_name_invalid_chars,
> > > > >                        "xattr name has invalid characters");
> [...]

The thing going wrong is that __bdos(xattr.v->x_name, 0) is returning 0.
This looks exactly like the bug I minimized here:
https://lore.kernel.org/all/202409170436.C3C6E7F7A@keescook/

Since there wasn't an LLVM open bug yet, I've created:
https://github.com/llvm/llvm-project/issues/110385
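
For readers following along, a rough userspace sketch of the access
pattern in question: a flexible array annotated with counted_by and
reached through a pointer that is itself a member of a by-value struct.
This is NOT the minimized reproducer linked above. struct xattr_like,
struct xattr_ref, COUNTED_BY and BDOS are hypothetical names; the layout
only mirrors the field order of struct bch_xattr, and the BDOS fallback
follows the same __has_builtin pattern as compiler_types.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* No annotation on compilers that lack counted_by. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif
#if __has_attribute(counted_by)
#define COUNTED_BY(member) __attribute__((counted_by(member)))
#else
#define COUNTED_BY(member)
#endif

/* Prefer the dynamic builtin, fall back to the static one. */
#if defined(__has_builtin)
# if __has_builtin(__builtin_dynamic_object_size)
#  define BDOS(p, type) __builtin_dynamic_object_size(p, type)
# endif
#endif
#ifndef BDOS
#define BDOS(p, type) __builtin_object_size(p, type)
#endif

/* Hypothetical stand-in for struct bch_xattr (userspace types). */
struct xattr_like {
	unsigned char	x_type;
	unsigned char	x_name_len;
	unsigned short	x_val_len;
	unsigned char	x_name[] COUNTED_BY(x_name_len);
};

/* Hypothetical stand-in for the by-value wrapper struct. */
struct xattr_ref {
	const void		*k;
	const struct xattr_like	*v;
};

/* Allocate an object holding `name`; the count is set before x_name is used. */
static struct xattr_like *xattr_like_new(const char *name)
{
	size_t len = strlen(name);
	struct xattr_like *x;

	assert(len <= 255);
	x = calloc(1, sizeof(*x) + len);
	if (!x)
		return NULL;
	x->x_name_len = (unsigned char)len;
	memcpy(x->x_name, name, len);
	return x;
}

/* What a counted_by-aware __bdos() is expected to report for ref.v->x_name. */
static size_t expected_name_size(struct xattr_ref ref)
{
	return (size_t)ref.v->x_name_len * sizeof(ref.v->x_name[0]);
}

/* What the compiler actually reports for the same expression. */
static size_t reported_name_size(struct xattr_ref ref)
{
	return BDOS(ref.v->x_name, 0);
}
```

On a compiler without the bug I would expect reported_name_size() to
return either the same value as expected_name_size() or (size_t)-1 when
the size is unknown. I have not verified what an affected clang build
prints for this particular sketch.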

-Kees
Jan Hendrik Farr Sept. 28, 2024, 11:33 p.m. UTC | #14
On 28 13:50:12, Kees Cook wrote:
> On Thu, Sep 26, 2024 at 06:30:15PM -0700, Bill Wendling wrote:
> > On Thu, Sep 26, 2024 at 3:18 PM Bill Wendling <morbo@google.com> wrote:
> > >
> > > On Thu, Sep 26, 2024 at 12:58 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > >
> > > > (cc Kees and Bill)
> > > >
> > > > On Thu, 26 Sept 2024 at 19:46, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> > > > >
> > > > > On 26 19:01:20, Jan Hendrik Farr wrote:
> > > > > > On 26 18:09:57, Thorsten Blum wrote:
> > > > > > > On 26. Sep 2024, at 17:28, Thorsten Blum <thorsten.blum@toblux.com> wrote:
> > > > > > > > On 26. Sep 2024, at 17:14, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> > [...]
> > > > > > > >> [    6.252375] memchr: detected buffer overflow: 12 byte read of buffer size 0
> > [...]
> > > > > >       bkey_fsck_err_on(memchr(xattr.v->x_name, '\0', xattr.v->x_name_len),
> > > > > >                        c, xattr_name_invalid_chars,
> > > > > >                        "xattr name has invalid characters");
> > [...]
> 
> The thing going wrong is that __bdos(xattr.v->x_name, 0) is returning 0.
> This looks exactly like the bug I minimized here:
> https://lore.kernel.org/all/202409170436.C3C6E7F7A@keescook/
> 
> Since there wasn't an LLVM open bug yet, I've created:
> https://github.com/llvm/llvm-project/issues/110385
> 
> -Kees
> 

I found a fix for the issue. Fixes both the issue in this thread as well
as your reproducer. First thought they might not actually be the same
issue, but they indeed are. Haven't tested against the issue Thorsten
linked.

Haven't run the clang tests on it yet, but it does successfully compile
my kernel and fix the issue.

I'll open a PR and give more explanation tomorrow, it's getting
pretty late over here in CEST.


Here's the patch to be applied on top of
https://github.com/llvm/llvm-project

--
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index 9166db4c7412..143dd3fcfcf8 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -1164,15 +1164,15 @@ llvm::Value *CodeGenFunction::EmitLoadOfCountedByField(
     Res = EmitDeclRefLValue(DRE).getPointer(*this);
     Res = Builder.CreateAlignedLoad(ConvertType(DRE->getType()), Res,
                                     getPointerAlign(), "dre.load");
-  } else if (const MemberExpr *ME = dyn_cast<MemberExpr>(StructBase)) {
-    LValue LV = EmitMemberExpr(ME);
-    Address Addr = LV.getAddress();
-    Res = Addr.emitRawPointer(*this);
   } else if (StructBase->getType()->isPointerType()) {
     LValueBaseInfo BaseInfo;
     TBAAAccessInfo TBAAInfo;
     Address Addr = EmitPointerWithAlignment(StructBase, &BaseInfo, &TBAAInfo);
     Res = Addr.emitRawPointer(*this);
+  } else if (const MemberExpr *ME = dyn_cast<MemberExpr>(StructBase)) {
+    LValue LV = EmitMemberExpr(ME);
+    Address Addr = LV.getAddress();
+    Res = Addr.emitRawPointer(*this);
   } else {
     return nullptr;
   }
Jan Hendrik Farr Sept. 29, 2024, 7:59 p.m. UTC | #15
On 29 01:33:40, Jan Hendrik Farr wrote:
> On 28 13:50:12, Kees Cook wrote:
> > On Thu, Sep 26, 2024 at 06:30:15PM -0700, Bill Wendling wrote:
> > > On Thu, Sep 26, 2024 at 3:18 PM Bill Wendling <morbo@google.com> wrote:
> > > >
> > > > On Thu, Sep 26, 2024 at 12:58 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > >
> > > > > (cc Kees and Bill)
> > > > >
> > > > > On Thu, 26 Sept 2024 at 19:46, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> > > > > >
> > > > > > On 26 19:01:20, Jan Hendrik Farr wrote:
> > > > > > > On 26 18:09:57, Thorsten Blum wrote:
> > > > > > > > On 26. Sep 2024, at 17:28, Thorsten Blum <thorsten.blum@toblux.com> wrote:
> > > > > > > > > On 26. Sep 2024, at 17:14, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> > > [...]
> > > > > > > > >> [    6.252375] memchr: detected buffer overflow: 12 byte read of buffer size 0
> > > [...]
> > > > > > >       bkey_fsck_err_on(memchr(xattr.v->x_name, '\0', xattr.v->x_name_len),
> > > > > > >                        c, xattr_name_invalid_chars,
> > > > > > >                        "xattr name has invalid characters");
> > > [...]
> > 
> > The thing going wrong is that __bdos(xattr.v->x_name, 0) is returning 0.
> > This looks exactly like the bug I minimized here:
> > https://lore.kernel.org/all/202409170436.C3C6E7F7A@keescook/
> > 
> > Since there wasn't an LLVM open bug yet, I've created:
> > https://github.com/llvm/llvm-project/issues/110385
> > 
> > -Kees
> > 
> I found a fix for the issue. Fixes both the issue in this thread as well
> as your reproducer. First thought they might not actually be the same
> issue, but they indeed are. Haven't tested against the issue Thorsten
> linked.
> 
> Haven't run the clang tests on it yet, but it does successfully compile
> my kernel and fix the issue.
> 
> I'll open a PR and give more explanation tomorrow, it's getting
> pretty late over here in CEST.
> 
> 
> Here's the patch to be applied on top of
> https://github.com/llvm/llvm-project
> 
> --
> diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
> index 9166db4c7412..143dd3fcfcf8 100644
> --- a/clang/lib/CodeGen/CGExpr.cpp
> +++ b/clang/lib/CodeGen/CGExpr.cpp
> @@ -1164,15 +1164,15 @@ llvm::Value *CodeGenFunction::EmitLoadOfCountedByField(
>      Res = EmitDeclRefLValue(DRE).getPointer(*this);
>      Res = Builder.CreateAlignedLoad(ConvertType(DRE->getType()), Res,
>                                      getPointerAlign(), "dre.load");
> -  } else if (const MemberExpr *ME = dyn_cast<MemberExpr>(StructBase)) {
> -    LValue LV = EmitMemberExpr(ME);
> -    Address Addr = LV.getAddress();
> -    Res = Addr.emitRawPointer(*this);
>    } else if (StructBase->getType()->isPointerType()) {
>      LValueBaseInfo BaseInfo;
>      TBAAAccessInfo TBAAInfo;
>      Address Addr = EmitPointerWithAlignment(StructBase, &BaseInfo, &TBAAInfo);
>      Res = Addr.emitRawPointer(*this);
> +  } else if (const MemberExpr *ME = dyn_cast<MemberExpr>(StructBase)) {
> +    LValue LV = EmitMemberExpr(ME);
> +    Address Addr = LV.getAddress();
> +    Res = Addr.emitRawPointer(*this);
>    } else {
>      return nullptr;
>    }


Here's the PR: https://github.com/llvm/llvm-project/pull/110437

I hope the way I added the CHECK tags in the test is good and that they
don't need manual cleanup; I'm not familiar with the llvm test system.


Best Regards
Jan
Thorsten Blum Oct. 2, 2024, 9:18 a.m. UTC | #16
On 28. Sep 2024, at 22:34, Kees Cook <kees@kernel.org> wrote:
> [...]
> 
> Sorry, I've been out of commission with covid. Globally disabling this
> macro for clang is not the right solution (way too big a hammer).
> 
> Until Bill has a fix, we can revert commit
> 86e92eeeb23741a072fe7532db663250ff2e726a, as the problem is limited to
> certain situations where 'counted_by' is in use.

I already encountered two other related __counted_by() issues [1][2]
that are now being reverted. Would it be an option to disable it
globally, but only for Clang < v19 (where it looks like it'll be fixed)?

Otherwise adding __counted_by() might be a slippery slope for a long
time and the edge cases don't seem to be that rare anymore.

Thanks,
Thorsten

[1] https://lore.kernel.org/all/20240909162725.1805-2-thorsten.blum@toblux.com/
[2] https://lore.kernel.org/all/20240923213809.235128-2-thorsten.blum@linux.dev/
Jan Hendrik Farr Oct. 3, 2024, 11:33 a.m. UTC | #17
On 02 11:18:57, Thorsten Blum wrote:
> On 28. Sep 2024, at 22:34, Kees Cook <kees@kernel.org> wrote:
> > [...]
> > 
> > Sorry, I've been out of commission with covid. Globally disabling this
> > macro for clang is not the right solution (way too big a hammer).
> > 
> > Until Bill has a fix, we can revert commit
> > 86e92eeeb23741a072fe7532db663250ff2e726a, as the problem is limited to
> > certain situations where 'counted_by' is in use.
> 
> I already encountered two other related __counted_by() issues [1][2]
> that are now being reverted. Would it be an option to disable it
> globally, but only for Clang < v19 (where it looks like it'll be fixed)?
> 
> Otherwise adding __counted_by() might be a slippery slope for a long
> time and the edge cases don't seem to be that rare anymore.
> 
> Thanks,
> Thorsten
> 
> [1] https://lore.kernel.org/all/20240909162725.1805-2-thorsten.blum@toblux.com/
> [2] https://lore.kernel.org/all/20240923213809.235128-2-thorsten.blum@linux.dev/

This issue is now fixed on the llvm main branch:
https://github.com/llvm/llvm-project/commit/882457a2eedbe6d53161b2f78fcf769fc9a93e8a

So presumably this will go into 19.1.2, not sure what this means for
distros that ship clang 18. Will they have to be notified to backport
this?

Best Regards
Jan
Thorsten Blum Oct. 3, 2024, 1:07 p.m. UTC | #18
On 3. Oct 2024, at 13:33, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
>> [...]
> 
> This issue is now fixed on the llvm main branch:
> https://github.com/llvm/llvm-project/commit/882457a2eedbe6d53161b2f78fcf769fc9a93e8a

Thanks!

Do you know if it also fixes the different sizes here:
https://godbolt.org/z/vvK9PE1Yq

I ran out of disk space when compiling llvm :0

> So presumably this will go into 19.1.2, not sure what this means for
> distros that ship clang 18. Will they have to be notified to backport
> this?
> 
> Best Regards
> Jan
Jan Hendrik Farr Oct. 3, 2024, 1:12 p.m. UTC | #19
On 03 15:07:52, Thorsten Blum wrote:
> On 3. Oct 2024, at 13:33, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> >> [...]
> > 
> > This issue is now fixed on the llvm main branch:
> > https://github.com/llvm/llvm-project/commit/882457a2eedbe6d53161b2f78fcf769fc9a93e8a
> 
> Thanks!
> 
> Do you know if it also fixes the different sizes here:
> https://godbolt.org/z/vvK9PE1Yq

Unfortunately this still prints 36.

> 
> I ran out of disk space when compiling llvm :0
> 
> > So presumably this will go into 19.1.2, not sure what this means for
> > distros that ship clang 18. Will they have to be notified to backport
> > this?
> > 
> > Best Regards
> > Jan
Thorsten Blum Oct. 3, 2024, 3:02 p.m. UTC | #20
On 3. Oct 2024, at 15:12, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> On 03 15:07:52, Thorsten Blum wrote:
>> On 3. Oct 2024, at 13:33, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
>>>> [...]
>>> 
>>> This issue is now fixed on the llvm main branch:
>>> https://github.com/llvm/llvm-project/commit/882457a2eedbe6d53161b2f78fcf769fc9a93e8a
>> 
>> Thanks!
>> 
>> Do you know if it also fixes the different sizes here:
>> https://godbolt.org/z/vvK9PE1Yq
> 
> Unfortunately this still prints 36.

I just realized that the counted_by attribute itself causes the 4 bytes
difference. When you remove the attribute, the sizes are equal again.

>> I ran out of disk space when compiling llvm :0
>> 
>>> So presumably this will go into 19.1.2, not sure what this means for
>>> distros that ship clang 18. Will they have to be notified to backport
>>> this?
>>> 
>>> Best Regards
>>> Jan
Jan Hendrik Farr Oct. 3, 2024, 3:17 p.m. UTC | #21
On 03 15:07:52, Thorsten Blum wrote:
> On 3. Oct 2024, at 13:33, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> >> [...]
> > 
> > This issue is now fixed on the llvm main branch:
> > https://github.com/llvm/llvm-project/commit/882457a2eedbe6d53161b2f78fcf769fc9a93e8a
> 
> Thanks!
> 
> Do you know if it also fixes the different sizes here:
> https://godbolt.org/z/vvK9PE1Yq

I have a patch for clang that changes the behavior to what gcc does and
what the kernel seems to be expecting right now, you can find it below.

I'm not 100% sure whether the gcc or the clang behavior is currently
correct. However, I'm gonna argue that gcc has it correct.

gcc currently says that the __bdos of struct containing a flexible array
member is:

sizeof(<whole struct>) + sizeof(<flexible array element>) * <count>

clang however does the following:

max(sizeof(<whole struct>), offsetof(<flexible array member>) + sizeof(<flexible array element>) * <count>)
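
To make the difference between the two formulas concrete, here is a
small sketch in C that just evaluates both of them by hand. struct s and
the function names are hypothetical; the struct is chosen so that it has
trailing padding on typical ABIs (on x86-64: 10 bytes of fields, padded
to 16). Neither function calls __bdos; they only restate the formulas as
written above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct with trailing padding after `count`. */
struct s {
	uint64_t	id;
	uint16_t	count;
	uint16_t	array[];
};

/* sizeof(<whole struct>) + sizeof(<flexible array element>) * <count> */
static size_t bdos_gcc_style(size_t count)
{
	return sizeof(struct s) + sizeof(uint16_t) * count;
}

/* max(sizeof(<whole struct>),
 *     offsetof(<flexible array member>) + sizeof(<element>) * <count>) */
static size_t bdos_clang_style(size_t count)
{
	size_t with_fam = offsetof(struct s, array) + sizeof(uint16_t) * count;

	assert(with_fam >= offsetof(struct s, array)); /* no wraparound */
	return with_fam > sizeof(struct s) ? with_fam : sizeof(struct s);
}
```

Once the array is large enough to cover the padding, the two results
differ by exactly sizeof(struct s) - offsetof(struct s, array), i.e. the
trailing padding of the struct.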


The kernel assumes the gcc behavior in places like linux/fs/posix_acl.c:

struct posix_acl *
posix_acl_clone(const struct posix_acl *acl, gfp_t flags)
{
	struct posix_acl *clone = NULL;

	if (acl) {
		int size = sizeof(struct posix_acl) + acl->a_count *
		           sizeof(struct posix_acl_entry);
		clone = kmemdup(acl, size, flags);
		if (clone)
			refcount_set(&clone->a_refcount, 1);
	}
	return clone;
}
EXPORT_SYMBOL_GPL(posix_acl_clone);

This is the code that triggers the problem in [1]. The way I see it, this
code should work, as you also allocate struct posix_acl with the same
sizeof(struct posix_acl) + acl->a_count * sizeof(struct posix_acl_entry)
as an argument to malloc (or in-kernel equivalent).

Based on the C standard the size of that object is the size passed to
malloc. See bottom of page 348 [2].


I'll put together another PR to llvm with this fix, just need to
add/change tests.

[1] https://lore.kernel.org/linux-kernel/3D0816D1-0807-4D37-8D5F-3C55CA910FAA@linux.dev/
[2] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf


--
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index d739597de4c8..1d112aededbd 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -919,8 +919,7 @@ CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type,
   //   2) bdos of the whole struct, including the flexible array:
   //
   //     __builtin_dynamic_object_size(p, 1) ==
-  //        max(sizeof(struct s),
-  //            offsetof(struct s, array) + p->count * sizeof(*p->array))
+  //        sizeof(struct s) + p->count * sizeof(*p->array))
   //
   ASTContext &Ctx = getContext();
   const Expr *Base = E->IgnoreParenImpCasts();
@@ -1052,22 +1051,13 @@ CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type,
     // The whole struct is specificed in the __bdos.
     const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(OuterRD);
 
-    // Get the offset of the FAM.
-    llvm::Constant *FAMOffset = ConstantInt::get(ResType, Offset, IsSigned);
-    Value *OffsetAndFAMSize =
-        Builder.CreateAdd(FAMOffset, Res, "", !IsSigned, IsSigned);
 
     // Get the full size of the struct.
     llvm::Constant *SizeofStruct =
         ConstantInt::get(ResType, Layout.getSize().getQuantity(), IsSigned);
 
-    // max(sizeof(struct s),
-    //     offsetof(struct s, array) + p->count * sizeof(*p->array))
-    Res = IsSigned
-              ? Builder.CreateBinaryIntrinsic(llvm::Intrinsic::smax,
-                                              OffsetAndFAMSize, SizeofStruct)
-              : Builder.CreateBinaryIntrinsic(llvm::Intrinsic::umax,
-                                              OffsetAndFAMSize, SizeofStruct);
+    // Add full size of struct and fam size
+    Res = Builder.CreateAdd(SizeofStruct, Res, "", !IsSigned, IsSigned);
   }
 
   // A negative \p IdxInst or \p CountedByInst means that the index lands
Jan Hendrik Farr Oct. 3, 2024, 3:22 p.m. UTC | #22
On 03 17:02:07, Thorsten Blum wrote:
> On 3. Oct 2024, at 15:12, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> > On 03 15:07:52, Thorsten Blum wrote:
> >> On 3. Oct 2024, at 13:33, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> >>>> [...]
> >>> 
> >>> This issue is now fixed on the llvm main branch:
> >>> https://github.com/llvm/llvm-project/commit/882457a2eedbe6d53161b2f78fcf769fc9a93e8a
> >> 
> >> Thanks!
> >> 
> >> Do you know if it also fixes the different sizes here:
> >> https://godbolt.org/z/vvK9PE1Yq
> > 
> > Unfortunately this still prints 36.
> 
> I just realized that the counted_by attribute itself causes the 4 bytes
> difference. When you remove the attribute, the sizes are equal again.

But we want these attributes to be in the kernel, so that
bounds-checking can be done in more scenarios, right?

This changes clang to print 40, right? gcc prints 40 in the example
whether the attribute is there or not.

> 
> >> I ran out of disk space when compiling llvm :0
> >> 
> >>> So presumably this will go into 19.1.2, not sure what this means for
> >>> distros that ship clang 18. Will they have to be notified to backport
> >>> this?
> >>> 
> >>> Best Regards
> >>> Jan
Thorsten Blum Oct. 3, 2024, 3:30 p.m. UTC | #23
On 3. Oct 2024, at 17:22, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> On 03 17:02:07, Thorsten Blum wrote:
>> On 3. Oct 2024, at 15:12, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
>>> On 03 15:07:52, Thorsten Blum wrote:
>>>> On 3. Oct 2024, at 13:33, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
>>>>>> [...]
>>>>> 
>>>>> This issue is now fixed on the llvm main branch:
>>>>> https://github.com/llvm/llvm-project/commit/882457a2eedbe6d53161b2f78fcf769fc9a93e8a
>>>> 
>>>> Thanks!
>>>> 
>>>> Do you know if it also fixes the different sizes here:
>>>> https://godbolt.org/z/vvK9PE1Yq
>>> 
>>> Unfortunately this still prints 36.
>> 
>> I just realized that the counted_by attribute itself causes the 4 bytes
>> difference. When you remove the attribute, the sizes are equal again.
> 
> But we want these attributes to be in the kernel, so that
> bounds-checking can be done in more scenarios, right?

Yes

> This changes clang to print 40, right? gcc prints 40 in the example
> whether the attribute is there or not.

Yes, clang prints 36 with and 40 without the attribute; gcc always 40.

>>>> I ran out of disk space when compiling llvm :0
>>>> 
>>>>> So presumably this will go into 19.1.2, not sure what this means for
>>>>> distros that ship clang 18. Will they have to be notified to backport
>>>>> this?
>>>>> 
>>>>> Best Regards
>>>>> Jan
Jan Hendrik Farr Oct. 3, 2024, 3:35 p.m. UTC | #24
On 03 17:30:28, Thorsten Blum wrote:
> On 3. Oct 2024, at 17:22, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> > On 03 17:02:07, Thorsten Blum wrote:
> >> On 3. Oct 2024, at 15:12, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> >>> On 03 15:07:52, Thorsten Blum wrote:
> >>>> On 3. Oct 2024, at 13:33, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> >>>>>> [...]
> >>>>> 
> >>>>> This issue is now fixed on the llvm main branch:
> >>>>> https://github.com/llvm/llvm-project/commit/882457a2eedbe6d53161b2f78fcf769fc9a93e8a
> >>>> 
> >>>> Thanks!
> >>>> 
> >>>> Do you know if it also fixes the different sizes here:
> >>>> https://godbolt.org/z/vvK9PE1Yq

Do you already have an open issue on the llvm github? Otherwise I'll
open one and submit the PR shortly.

> >>> 
> >>> Unfortunately this still prints 36.
> >> 
> >> I just realized that the counted_by attribute itself causes the 4 bytes
> >> difference. When you remove the attribute, the sizes are equal again.
> > 
> > But we want these attributes to be in the kernel, so that
> > bounds-checking can be done in more scenarios, right?
> 
> Yes
> 
> > This changes clang to print 40, right? gcc prints 40 in the example
> > whether the attribute is there or not.
> 
> Yes, clang prints 36 with and 40 without the attribute; gcc always 40.
> 
> >>>> I ran out of disk space when compiling llvm :0
> >>>> 
> >>>>> So presumably this will go into 19.1.2, not sure what this means for
> >>>>> distros that ship clang 18. Will they have to be notified to backport
> >>>>> this?
> >>>>> 
> >>>>> Best Regards
> >>>>> Jan
Thorsten Blum Oct. 3, 2024, 3:43 p.m. UTC | #25
On 3. Oct 2024, at 17:35, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> On 03 17:30:28, Thorsten Blum wrote:
>> On 3. Oct 2024, at 17:22, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
>>> On 03 17:02:07, Thorsten Blum wrote:
>>>> On 3. Oct 2024, at 15:12, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
>>>>> On 03 15:07:52, Thorsten Blum wrote:
>>>>>> On 3. Oct 2024, at 13:33, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
>>>>>>>> [...]
>>>>>>> 
>>>>>>> This issue is now fixed on the llvm main branch:
>>>>>>> https://github.com/llvm/llvm-project/commit/882457a2eedbe6d53161b2f78fcf769fc9a93e8a
>>>>>> 
>>>>>> Thanks!
>>>>>> 
>>>>>> Do you know if it also fixes the different sizes here:
>>>>>> https://godbolt.org/z/vvK9PE1Yq
> 
> Do you already have an open issue on the llvm github? Otherwise I'll
> open one and submit the PR shortly.

No, feel free to open one. Thanks!

>>>>> 
>>>>> Unfortunately this still prints 36.
>>>> 
>>>> I just realized that the counted_by attribute itself causes the 4 bytes
>>>> difference. When you remove the attribute, the sizes are equal again.
>>> 
>>> But we want these attributes to be in the kernel, so that
>>> bounds-checking can be done in more scenarios, right?
>> 
>> Yes
>> 
>>> This changes clang to print 40, right? gcc prints 40 in the example
>>> whether the attribute is there or not.
>> 
>> Yes, clang prints 36 with and 40 without the attribute; gcc always 40.
>> 
>>>>>> I ran out of disk space when compiling llvm :0
>>>>>> 
>>>>>>> So presumably this will go into 19.1.2, not sure what this means for
>>>>>>> distros that ship clang 18. Will they have to be notified to backport
>>>>>>> this?
>>>>>>> 
>>>>>>> Best Regards
>>>>>>> Jan
Jan Hendrik Farr Oct. 3, 2024, 4:32 p.m. UTC | #26
On 03 17:43:02, Thorsten Blum wrote:
> On 3. Oct 2024, at 17:35, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> > On 03 17:30:28, Thorsten Blum wrote:
> >> On 3. Oct 2024, at 17:22, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> >>> On 03 17:02:07, Thorsten Blum wrote:
> >>>> On 3. Oct 2024, at 15:12, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> >>>>> On 03 15:07:52, Thorsten Blum wrote:
> >>>>>> On 3. Oct 2024, at 13:33, Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> >>>>>>>> [...]
> >>>>>>> 
> >>>>>>> This issue is now fixed on the llvm main branch:
> >>>>>>> https://github.com/llvm/llvm-project/commit/882457a2eedbe6d53161b2f78fcf769fc9a93e8a
> >>>>>> 
> >>>>>> Thanks!
> >>>>>> 
> >>>>>> Do you know if it also fixes the different sizes here:
> >>>>>> https://godbolt.org/z/vvK9PE1Yq
> > 
> > Do you already have an open issue on the llvm github? Otherwise I'll
> > open one and submit the PR shortly.
> 
> No, feel free to open one. Thanks!

Here's the issue:
https://github.com/llvm/llvm-project/issues/111009

Here's the PR:
https://github.com/llvm/llvm-project/pull/111015

(Looks like I violated the code formatting rules somewhere, will fix)

> 
> >>>>> 
> >>>>> Unfortunately this still prints 36.
> >>>> 
> >>>> I just realized that the counted_by attribute itself causes the 4 bytes
> >>>> difference. When you remove the attribute, the sizes are equal again.
> >>> 
> >>> But we want these attributes to be in the kernel, so that
> >>> bounds-checking can be done in more scenarios, right?
> >> 
> >> Yes
> >> 
> >>> This changes clang to print 40, right? gcc prints 40 in the example
> >>> whether the attribute is there or not.
> >> 
> >> Yes, clang prints 36 with and 40 without the attribute; gcc always 40.
> >> 
> >>>>>> I ran out of disk space when compiling llvm :0
> >>>>>> 
> >>>>>>> So presumably this will go into 19.1.2, not sure what this means for
> >>>>>>> distros that ship clang 18. Will they have to be notified to backport
> >>>>>>> this?
> >>>>>>> 
> >>>>>>> Best Regards
> >>>>>>> Jan
>
Kees Cook Oct. 3, 2024, 9:23 p.m. UTC | #27
On Wed, Oct 02, 2024 at 11:18:57AM +0200, Thorsten Blum wrote:
> On 28. Sep 2024, at 22:34, Kees Cook <kees@kernel.org> wrote:
> > [...]
> > 
> > Sorry, I've been out of commission with covid. Globally disabling this
> > macro for clang is not the right solution (way too big a hammer).
> > 
> > Until Bill has a fix, we can revert commit
> > 86e92eeeb23741a072fe7532db663250ff2e726a, as the problem is limited to
> > certain situations where 'counted_by' is in use.
> 
> I already encountered two other related __counted_by() issues [1][2]
> that are now being reverted. Would it be an option to disable it
> globally, but only for Clang < v19 (where it looks like it'll be fixed)?

Yeah, once we have a solid fix (so we have a known Clang version to
target), I'll want counted_by disabled in versions prior to that.
Kees Cook Oct. 3, 2024, 9:28 p.m. UTC | #28
On Thu, Oct 03, 2024 at 05:17:08PM +0200, Jan Hendrik Farr wrote:
> gcc currently says that the __bdos of struct containing a flexible array
> member is:
> 
> sizeof(<whole struct>) + sizeof(<flexible array element>) * <count>
> 
> clang however does the following:
> 
> max(sizeof(<whole struct>), offsetof(<flexible array member>) + sizeof(<flexible array element>) * <count>)

Clang's calculation seems very wrong. I would expect it to match GCC's.
Jan Hendrik Farr Oct. 3, 2024, 9:48 p.m. UTC | #29
On 03 14:28:01, Kees Cook wrote:
> On Thu, Oct 03, 2024 at 05:17:08PM +0200, Jan Hendrik Farr wrote:
> > gcc currently says that the __bdos of struct containing a flexible array
> > member is:
> > 
> > sizeof(<whole struct>) + sizeof(<flexible array element>) * <count>
> > 
> > clang however does the following:
> > 
> > max(sizeof(<whole struct>), offsetof(<flexible array member>) + sizeof(<flexible array element>) * <count>)
> 
> Clang's calculation seems very wrong. I would expect it to match GCC's.
> 

I was on the very same train of thought, but I have since changed my
mind a bit. A struct containing a flexible array member can be allocated in
two ways:

(1):

struct posix_acl *acl = malloc(sizeof(struct posix_acl) + sizeof(struct posix_acl_entry) * 1);
acl->a_count = 1;

or (2):

struct posix_acl *acl = malloc(offsetof(struct posix_acl, a_entries) + sizeof(struct posix_acl_entry) * 1);
acl->a_count = 1;

Both are valid ways to allocate it. __bdos does not know which of these
methods was used to allocate the struct whose size it has to determine,
so it's giving the lower bound that doesn't include the (potential)
padding at the end.
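
A minimal userspace sketch of the two size calculations, for illustration
only. The demo_* types are hypothetical stand-ins that merely mimic the
layout of struct posix_acl (they are not the kernel definitions). On an
LP64 target such as x86-64 the FAM starts at offset 28 while sizeof() is
32, so the two methods differ by 4 bytes of tail padding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; they only mimic the layout of the kernel types. */
struct demo_rcu_head {
	struct demo_rcu_head *next;
	void (*func)(struct demo_rcu_head *head);
};

struct demo_acl_entry {
	short e_tag;
	unsigned short e_perm;
	unsigned int e_id;
};

struct demo_acl {
	unsigned int a_refcount;
	struct demo_rcu_head a_rcu;
	unsigned int a_count;
	struct demo_acl_entry a_entries[];
};

/* Method (1): sizeof() based, includes any tail padding of the header. */
size_t demo_size_method1(size_t count)
{
	return sizeof(struct demo_acl) + count * sizeof(struct demo_acl_entry);
}

/* Method (2): offsetof() based, starts counting where the FAM begins. */
size_t demo_size_method2(size_t count)
{
	return offsetof(struct demo_acl, a_entries) +
	       count * sizeof(struct demo_acl_entry);
}

/* Tail padding: the number of bytes by which method (1) exceeds method (2). */
size_t demo_tail_padding(void)
{
	size_t pad = sizeof(struct demo_acl) - offsetof(struct demo_acl, a_entries);

	assert(demo_size_method1(0) - demo_size_method2(0) == pad);
	return pad;
}
```

For one entry on x86-64 this gives 40 bytes for method (1) and 36 bytes for
method (2), the same two numbers seen in the gcc/clang comparison earlier
in the thread.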

So it comes down to false positives vs false negatives...
More details here:
https://github.com/llvm/llvm-project/pull/111015

Clang's current behavior would essentially force kernel code to always
assume option (2) is used. So

struct posix_acl *
posix_acl_clone(const struct posix_acl *acl, gfp_t flags)
{
	struct posix_acl *clone = NULL;

	if (acl) {
		int size = sizeof(struct posix_acl) + acl->a_count *
		           sizeof(struct posix_acl_entry);
		clone = kmemdup(acl, size, flags);
		if (clone)
			refcount_set(&clone->a_refcount, 1);
	}
	return clone;
}
EXPORT_SYMBOL_GPL(posix_acl_clone);

from linux/fs/posix_acl.c would have to turn into something like:

struct posix_acl *
posix_acl_clone(const struct posix_acl *acl, gfp_t flags)
{
	struct posix_acl *clone = NULL;

	if (acl) {
		int size = offsetof(struct posix_acl, a_entries) + acl->a_count *
		           sizeof(struct posix_acl_entry);
		clone = kmemdup(acl, size, flags);
		if (clone)
			refcount_set(&clone->a_refcount, 1);
	}
	return clone;
}
EXPORT_SYMBOL_GPL(posix_acl_clone);

That version is arguably safer: can you really be sure this posix_acl
wasn't allocated using method (2)?


After looking at the assembly produced by gcc more, it actually looks
like it's using the allocation size if it's known in the current context
(for example if the struct was just malloced in the same function)
and otherwise returns INT_MAX for the __bdos of a struct containing a
flexible array member. It's only returning the size based on the
__counted_by attribute if you ask it for the __bdos of the flexible
array member itself.
Jan Hendrik Farr Oct. 3, 2024, 10:05 p.m. UTC | #30
On 03 14:23:20, Kees Cook wrote:
> On Wed, Oct 02, 2024 at 11:18:57AM +0200, Thorsten Blum wrote:
> > On 28. Sep 2024, at 22:34, Kees Cook <kees@kernel.org> wrote:
> > > [...]
> > > 
> > > Sorry, I've been out of commission with covid. Globally disabling this
> > > macro for clang is not the right solution (way too big a hammer).
> > > 
> > > Until Bill has a fix, we can revert commit
> > > 86e92eeeb23741a072fe7532db663250ff2e726a, as the problem is limited to
> > > certain situations where 'counted_by' is in use.
> > 
> > I already encountered two other related __counted_by() issues [1][2]
> > that are now being reverted. Would it be an option to disable it
> > globally, but only for Clang < v19 (where it looks like it'll be fixed)?
> 
> Yeah, once we have a solid fix (so we have a known Clang version to
> target), I'll want counted_by disabled in versions prior to that.
> 

Just to clarify: There are two separate issues. One was __bdos returning
0 (or sometimes other garbage). That one is fixed in main by [1] (so
will presumably be fixed in 19.1.2). The second is __bdos sometimes being off
by 4 bytes. That could be addressed by open PR [2].

[1] https://github.com/llvm/llvm-project/pull/110497
[2] https://github.com/llvm/llvm-project/pull/111015
Kees Cook Oct. 4, 2024, 5:13 p.m. UTC | #31
On Thu, Oct 03, 2024 at 11:48:18PM +0200, Jan Hendrik Farr wrote:
> On 03 14:28:01, Kees Cook wrote:
> > On Thu, Oct 03, 2024 at 05:17:08PM +0200, Jan Hendrik Farr wrote:
> > > gcc currently says that the __bdos of struct containing a flexible array
> > > member is:
> > > 
> > > sizeof(<whole struct>) + sizeof(<flexible array element>) * <count>
> > > 
> > > clang however does the following:
> > > 
> > > max(sizeof(<whole struct>), offsetof(<flexible array member>) + sizeof(<flexible array element>) * <count>)
> > 
> > Clang's calculation seems very wrong. I would expect it to match GCC's.
> > 
> 
> I was on the very same train of thought, but I have since changed my
> mind a bit. A struct containing a flexible array member can be allocated in
> two ways:
> 
> (1):
> 
> struct posix_acl *acl = malloc(sizeof(struct posix_acl) + sizeof(struct posix_acl_entry) * 1);
> acl->a_count = 1;
> 
> or (2):
> 
> struct posix_acl *acl = malloc(offsetof(struct posix_acl, a_entries) + sizeof(struct posix_acl_entry) * 1);
> acl->a_count = 1;
> 
> Both are valid ways to allocate it. __bdos does not know which of these
> methods was used to allocate the struct whose size it has to determine,
> so it's giving the lower bound that doesn't include the (potential)
> padding at the end.

I want to separate several easily confused issues. Instead of just
saying __bdos, let's clearly refer to what calculation within bdos is
being used. There are 3 choices currently:
- alloc_size attribute
- counted_by attribute
- fallback to __bos (which is similar to sizeof(), except that FAMs are 0 sized)

Additionally there are (for all intents and purposes) 2 size
determinations to be made by __bos and __bdos, via argument 2:
- containing object size (type 0) ("maximum size")
- specific object size (type 1) ("minimum size")

For example, consider:

struct posix_acl *acl = malloc(1024);
acl->a_count = 1;

what should these return:

	__bos(acl, 0)
	__bos(acl, 1)
	__bdos(acl, 0)
	__bdos(acl, 1)
	__bos(acl->a_entries, 0)
	__bos(acl->a_entries, 1)
	__bdos(acl->a_entries, 0)
	__bdos(acl->a_entries, 1)

> So it comes down to false positives vs false negatives...
> More details here:
> https://github.com/llvm/llvm-project/pull/111015
> 
> Clang's current behavior would essentially force kernel code to always
> assume option (2) is used. So
> 
> struct posix_acl *
> posix_acl_clone(const struct posix_acl *acl, gfp_t flags)
> {
> 	struct posix_acl *clone = NULL;
> 
> 	if (acl) {
> 		int size = sizeof(struct posix_acl) + acl->a_count *
> 		           sizeof(struct posix_acl_entry);
> 		clone = kmemdup(acl, size, flags);
> 		if (clone)
> 			refcount_set(&clone->a_refcount, 1);
> 	}
> 	return clone;
> }
> EXPORT_SYMBOL_GPL(posix_acl_clone);
> 
> from linux/fs/posix_acl.c would have to turn into something like:
> 
> struct posix_acl *
> posix_acl_clone(const struct posix_acl *acl, gfp_t flags)
> {
> 	struct posix_acl *clone = NULL;
> 
> 	if (acl) {
> 		int size = offsetof(struct posix_acl, a_entries) + acl->a_count *
> 		           sizeof(struct posix_acl_entry);
> 		clone = kmemdup(acl, size, flags);
> 		if (clone)
> 			refcount_set(&clone->a_refcount, 1);
> 	}
> 	return clone;
> }
> EXPORT_SYMBOL_GPL(posix_acl_clone);
> 
> Which is actually safer, because can you actually be sure this posix_acl
> wasn't allocated using method (2)?

First, this should not be using an open coded calculation at all; it
should use the struct_size() macro.

Secondly, if we want to change struct_size(), then we must (via
allmodconfig builds) determine all the places in the kernel
where the calculated size changes, and audit those for safety.

Right now, struct_size() over-estimates in the face of padding.

We're already moving the kernel toward not even calling struct_size()
externally from the allocation, and instead using it within the
allocation macros themselves:
https://lore.kernel.org/lkml/20240822231324.make.666-kees@kernel.org/
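
To illustrate the over-estimate, here is a rough userspace approximation of
what struct_size() computes. It is not the kernel macro:
demo_struct_size() and DEMO_STRUCT_SIZE() are made-up names, struct
demo_buf is a made-up type, and the saturate-to-SIZE_MAX behaviour is
modelled with the GCC/Clang overflow builtins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct whose header has tail padding before the FAM ends it. */
struct demo_buf {
	size_t len;
	unsigned char flag;
	unsigned int items[];
};

/* header + elem * count, saturating at SIZE_MAX instead of wrapping. */
size_t demo_struct_size(size_t header, size_t elem, size_t count)
{
	size_t bytes;

	if (__builtin_mul_overflow(elem, count, &bytes) ||
	    __builtin_add_overflow(bytes, header, &bytes))
		return SIZE_MAX;
	return bytes;
}

/* sizeof()-based, like method (1): the header term includes tail padding. */
#define DEMO_STRUCT_SIZE(type, member, count) \
	demo_struct_size(sizeof(type), sizeof(((type *)0)->member[0]), (count))

/* Bytes by which the sizeof()-based result exceeds the offsetof()-based one. */
size_t demo_overestimate(size_t count)
{
	size_t tight = offsetof(struct demo_buf, items) +
		       count * sizeof(unsigned int);

	assert(DEMO_STRUCT_SIZE(struct demo_buf, items, count) >= tight);
	return DEMO_STRUCT_SIZE(struct demo_buf, items, count) - tight;
}
```

In this sketch the over-estimate is constant: it is exactly the tail
padding of the header, independent of the element count.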

> After looking at the assembly produced by gcc more, it actually looks
> like it's using the allocation size if it's known in the current context
> (for example if the struct was just malloced in the same function)
> and otherwise returns INT_MAX for the __bdos of a struct containing a
> flexible array member. It's only returning the size based on the
> __counted_by attribute if you ask it for the __bdos of the flexible
> array member itself.

Here is my test case for all the corner cases we've found so far:
https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c

I'd prefer we add cases there so we can all be talking about the same
things. :)

-Kees
Jan Hendrik Farr Oct. 7, 2024, 3:56 a.m. UTC | #32
> I want to separate several easily confused issues. Instead of just
> saying __bdos, let's clearly refer to what calculation within bdos is
> being used. There are 3 choices currently:
> - alloc_size attribute
> - counted_by attribute
> - fallback to __bos (which is similar to sizeof(), except that FAMs are 0 sized)
> 
> Additionally there are (for all intents and purposes) 2 size
> determinations to be made by __bos and __bdos, via argument 2:
> - containing object size (type 0) ("maximum size")
> - specific object size (type 1) ("minimum size")

"maximum" vs "minimum" size would be type 0 vs type 2, but I think you
do mean type 0 and type 1 as those are the types currently used by
__struct_size and __member_size. Those are both "maximum" sizes.
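
The type 0 / type 1 / type 2 distinction is easiest to see on a member
that is not a FAM. A small sketch, assuming GCC or Clang; struct demo_box
and the demo_bos_* helpers are hypothetical names used only here. Type 0
measures up to the end of the whole enclosing object, type 1 up to the end
of the member itself, and type 2 is the "minimum" counterpart of type 0.

```c
#include <assert.h>
#include <stddef.h>

struct demo_box {
	int before;
	char buf[8];
	int after;
};

static struct demo_box demo_box_instance;

/* Type 0: upper bound, measured to the end of the whole enclosing object. */
size_t demo_bos_type0(void)
{
	return __builtin_object_size(demo_box_instance.buf, 0);
}

/* Type 1: upper bound, measured to the end of the closest subobject (buf). */
size_t demo_bos_type1(void)
{
	return __builtin_object_size(demo_box_instance.buf, 1);
}

/* Type 2: lower bound for the whole object; 0 if nothing is known. */
size_t demo_bos_type2(void)
{
	return __builtin_object_size(demo_box_instance.buf, 2);
}

/* The member-limited bound can never exceed the whole-object bound. */
int demo_bos_is_consistent(void)
{
	return demo_bos_type1() <= demo_bos_type0() &&
	       demo_bos_type2() <= demo_bos_type0();
}
```

When the compiler cannot determine a size, types 0 and 1 return
(size_t)-1 and type 2 returns 0, so the ordering checked by
demo_bos_is_consistent() holds either way.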

> 
> For example, consider:
> 
> struct posix_acl *acl = malloc(1024);
> acl->a_count = 1;
> 
> what should these return:
> 
> 	__bos(acl, 0)
> 	__bos(acl, 1)
> 	__bdos(acl, 0)
> 	__bdos(acl, 1)
> 	__bos(acl->a_entries, 0)
> 	__bos(acl->a_entries, 1)
> 	__bdos(acl->a_entries, 0)
> 	__bdos(acl->a_entries, 1)
> 

I gathered some data from clang and gcc for all these cases and
additionally varied whether the allocation size is a compile time known
constant, runtime known, or not known. I also varied whether
__counted_by was used.

Source code: [1]


Abbreviations:

FAM      = flexible array member
-1       = SIZE_MAX
p->a_ent = p->a_entries
comp.    = allocation size is compile time known
run.     = allocation size is runtime known
none     = allocation size is unknown
count    = __counted_by attribute in use
correct  = What I think the correct answers should be. In some places I
have two answers. In that case the second number is what the kernel
currently expects.


And here's the data:

function        |comp.|run.|none|count| gcc  |clang |correct
----------------|-----|----|----|-----|------|------|-----
bos(p, 0)       |  x  |    |    |     | 1024 | 1024 | 1024
bos(p, 0)       |     | x  |    |     |  -1  |  -1  | -1
bos(p, 0)       |     |    | x  |     |  -1  |  -1  | -1
bos(p, 0)       |  x  |    |    |  x  | 1024 | 1024 | 1024
bos(p, 0)       |     | x  |    |  x  |  -1  |  -1  | -1
bos(p, 0)       |     |    | x  |  x  |  -1  |  -1  | -1
----------------|-----|----|----|-----|------|------|-----
bos(p, 1)       |  x  |    |    |     | 1024 | 1024 | 1024
bos(p, 1)       |     | x  |    |     |  -1  |  -1  | -1
bos(p, 1)       |     |    | x  |     |  -1  |  -1  | -1
bos(p, 1)       |  x  |    |    |  x  | 1024 | 1024 | 1024
bos(p, 1)       |     | x  |    |  x  |  -1  |  -1  | -1
bos(p, 1)       |     |    | x  |  x  |  -1  |  -1  | -1
----------------|-----|----|----|-----|------|------|-----
bdos(p, 0)      |  x  |    |    |     | 1024 | 1024 | 1024
bdos(p, 0)      |     | x  |    |     | 1024 | 1024 | 1024
bdos(p, 0)      |     |    | x  |     |  -1  |  -1  | -1
bdos(p, 0)      |  x  |    |    |  x  | 1024 |  36  | 43 / 40
bdos(p, 0)      |     | x  |    |  x  | 1024 |  36  | 43 / 40
bdos(p, 0)      |     |    | x  |  x  |  -1  |  36  | 43 / 40
----------------|-----|----|----|-----|------|------|-----
bdos(p, 1)      |  x  |    |    |     | 1024 | 1024 | 1024
bdos(p, 1)      |     | x  |    |     | 1024 | 1024 | 1024
bdos(p, 1)      |     |    | x  |     |  -1  |  -1  | -1
bdos(p, 1)      |  x  |    |    |  x  | 1024 |  36  | 43 / 40
bdos(p, 1)      |     | x  |    |  x  | 1024 |  36  | 43 / 40
bdos(p, 1)      |     |    | x  |  x  |  -1  |  36  | 43 / 40
----------------|-----|----|----|-----|------|------|-----
bos(p->a_ent, 0)|  x  |    |    |     |  996 | 996  | 996
bos(p->a_ent, 0)|     | x  |    |     |  -1  |  -1  | -1
bos(p->a_ent, 0)|     |    | x  |     |  -1  |  -1  | -1
bos(p->a_ent, 0)|  x  |    |    |  x  |  996 | 996  | 996
bos(p->a_ent, 0)|     | x  |    |  x  |  -1  |  -1  | -1
bos(p->a_ent, 0)|     |    | x  |  x  |  -1  |  -1  | -1
----------------|-----|----|----|-----|------|------|-----
bos(p->a_ent, 1)|  x  |    |    |     |  996 | 996  | 992
bos(p->a_ent, 1)|     | x  |    |     |  -1  |  -1  | -1
bos(p->a_ent, 1)|     |    | x  |     |  -1  |  -1  | -1
bos(p->a_ent, 1)|  x  |    |    |  x  |  996 | 996  | 992
bos(p->a_ent, 1)|     | x  |    |  x  |  -1  |  -1  | -1
bos(p->a_ent, 1)|     |    | x  |  x  |  -1  |  -1  | -1
----------------|-----|----|----|-----|------|------|-----
bdos(p->a_ent,0)|  x  |    |    |     |  996 | 996  | 996
bdos(p->a_ent,0)|     | x  |    |     |  996 | 996  | 996
bdos(p->a_ent,0)|     |    | x  |     |  -1  |  -1  | -1
bdos(p->a_ent,0)|  x  |    |    |  x  |   8  |  8   |  8
bdos(p->a_ent,0)|     | x  |    |  x  |   8  |  8   |  8
bdos(p->a_ent,0)|     |    | x  |  x  |   8  |  8   |  8
----------------|-----|----|----|-----|------|------|-----
bdos(p->a_ent,1)|  x  |    |    |     |  996 | 996  | 992
bdos(p->a_ent,1)|     | x  |    |     |  996 | 996  | 992
bdos(p->a_ent,1)|     |    | x  |     |  -1  |  -1  | -1
bdos(p->a_ent,1)|  x  |    |    |  x  |   8  |  8   |  8
bdos(p->a_ent,1)|     | x  |    |  x  |   8  |  8   |  8
bdos(p->a_ent,1)|     |    | x  |  x  |   8  |  8   |  8
----------------|-----|----|----|-----|------|------|-----

bos only uses the allocation size to give its answers. It only works if
it is a compile time known constant. bos also does not utilize the
__counted_by attribute.

bdos on the other hand allows the allocation size to be runtime known.
It also makes use of the __counted_by attribute if present, which always
takes precedence over the allocation size when the compiler supports it
for the particular case. So in those cases you can "lie" to the compiler
about the size of an object.

clang supports the __counted_by attribute for both cases (p and
p->a_entries). gcc only supports it for p->a_entries cases.



Issue A (clang)
=======

function        |comp.|run.|none|count| gcc  |clang |correct
----------------|-----|----|----|-----|------|------|-----
bdos(p, 0)      |  x  |    |    |  x  | 1024 |  36  | 43 / 40
bdos(p, 0)      |     | x  |    |  x  | 1024 |  36  | 43 / 40
bdos(p, 0)      |     |    | x  |  x  |  -1  |  36  | 43 / 40
bdos(p, 1)      |  x  |    |    |  x  | 1024 |  36  | 43 / 40
bdos(p, 1)      |     | x  |    |  x  | 1024 |  36  | 43 / 40
bdos(p, 1)      |     |    | x  |  x  |  -1  |  36  | 43 / 40

These cases also represent the "bdos off by 4" issue in clang. clang
will compute these results using:

max(sizeof(struct posix_acl), offsetof(struct posix_acl, a_entries) +
count * sizeof(struct posix_acl_entry)) = 36

The kernel on the other hand expects this behavior:

sizeof(struct posix_acl) + count * sizeof(struct posix_acl_entry) = 40


I think the correct calculation would actually be this:

offsetof(struct posix_acl, a_entries)
+ (acl->a_count + 1) * sizeof(struct posix_acl_entry) - 1 = 43

The C11 standard says that when the . or -> operator is used on a struct
with an FAM it behaves like the FAM was replaced with the largest array
(with the same element type) that would not make the object any larger
(see page 113 and 114 of [2]).
So there are actually multiple sizes of the object that are consistent
with a count of 1.

malloc-max = maximum size of the object
malloc-min = minimum size of the object
FAME = flexible array member element
(FAME) = hypothetical 2nd FAME

<-----------------malloc-max-------------->
<-----------------malloc-min------->
<------sizeof(posix_acl)------->
                            <-FAME-><(FAME)>

The clang documentation of type 0 (vs type 2) bdos says this:

If ``type & 2 == 0``, the least ``n`` is returned such that accesses to 
   ``(const char*)ptr + n`` and beyond are known to be out of bounds.

We only _know_ that access to the last byte of a 2nd hypothetical FAME
would be out of bounds. All the bytes before that are padding that is
allowed by the standard.
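
The three candidate formulas can be compared side by side with a small
userspace sketch. The demo_* types are hypothetical stand-ins that only
mimic the layout of struct posix_acl, and the function names are made up
for this illustration; none of this is compiler or kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; they only mimic the layout of the kernel types. */
struct demo_rcu_head {
	struct demo_rcu_head *next;
	void (*func)(struct demo_rcu_head *head);
};

struct demo_acl_entry {
	short e_tag;
	unsigned short e_perm;
	unsigned int e_id;
};

struct demo_acl {
	unsigned int a_refcount;
	struct demo_rcu_head a_rcu;
	unsigned int a_count;
	struct demo_acl_entry a_entries[];
};

#define DEMO_OFF  offsetof(struct demo_acl, a_entries)
#define DEMO_ELEM sizeof(struct demo_acl_entry)

/* max(sizeof(struct), offsetof(FAM) + count * elem): what clang computes. */
size_t demo_size_clang_style(size_t count)
{
	size_t tight = DEMO_OFF + count * DEMO_ELEM;

	return tight > sizeof(struct demo_acl) ? tight : sizeof(struct demo_acl);
}

/* sizeof(struct) + count * elem: what the kernel currently expects. */
size_t demo_size_kernel_style(size_t count)
{
	return sizeof(struct demo_acl) + count * DEMO_ELEM;
}

/* Largest size that still holds only `count` whole elements (C11 reading). */
size_t demo_size_c11_upper(size_t count)
{
	assert(DEMO_ELEM > 0);
	return DEMO_OFF + (count + 1) * DEMO_ELEM - 1;
}
```

With a count of 1 on x86-64 these evaluate to 36, 40 and 43 respectively,
matching the numbers in the table.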


However, this calculation doesn't get the kernel out of trouble either.
While it would fix the issue for this particular struct, it does not
solve it for all structs:

What if the elements of the FAM were chars instead of
struct posix_acl_entry here? In that case the kernel is back to
overestimating the size of the struct / underreporting the count to the
compiler. So while I think this answer is more correct it doesn't
actually solve the issue.

Example:
Let's say the kernel allocates one of these posix_acl_char structs for a
single char in the array:

malloc(sizeof(posix_acl_char) + 1 * sizeof(char)) = 33

The C standard actually says that this object will behave like this when
the FAM is accessed:

struct posix_acl_char {
    refcount_t a_refcount;
    struct rcu_head a_rcu;
    unsigned int a_count;
    char a_entries[5];
};

a_count should be set to 5, not 1!


So we would really need an option to tell the compiler to use the same
size calculation as the kernel expects here, or maybe be able to specify
an offset in the __counted_by attribute. Alternatively clang could use
an option to disable the use of __counted_by for cases where the whole
struct is passed. This would make it behave like gcc.



Issue B (clang + gcc)
=======

A less serious issue happens with these cases:

function        |comp.|run.|none|count| gcc  |clang |correct
----------------|-----|----|----|-----|------|------|-----
bos(p->a_ent, 1)|  x  |    |    |     |  996 | 996  | 992
bos(p->a_ent, 1)|  x  |    |    |  x  |  996 | 996  | 992
bdos(p->a_ent,1)|  x  |    |    |     |  996 | 996  | 992
bdos(p->a_ent,1)|     | x  |    |     |  996 | 996  | 992

In this case the size returned by bos/bdos is too large, so this won't
lead to false positives. Both clang and gcc simply compute the distance
from the start of the FAM to the end of the whole
object. I believe this is wrong. According to the C standard the object
should behave like the FAM was replaced with the largest array that does
not make the object any larger. The size of that array is 124 elements.
So the posix_acl becomes:

struct posix_acl {
    refcount_t a_refcount;
    struct rcu_head a_rcu;
    unsigned int a_count;
    struct posix_acl_entry a_entries[124];
};

Since this is a type 1 bos/bdos it should return the size of just the
array, which is 124 * 8 = 992, and not 124.5 * 8 = 996.
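
Both this 124-element figure and the a_entries[5] figure from the char
example under Issue A come from the same floor division. A minimal sketch
of that arithmetic follows; demo_fam_capacity() and demo_fam_bytes() are
hypothetical helper names, and the FAM offset of 28 is the one implied by
996 = 1024 - 28 in the table.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of whole FAM elements that fit into an allocation: the "largest
 * array that would not make the object any larger".
 */
size_t demo_fam_capacity(size_t alloc_bytes, size_t fam_offset, size_t elem_size)
{
	assert(elem_size > 0);
	assert(alloc_bytes >= fam_offset);
	return (alloc_bytes - fam_offset) / elem_size;
}

/* Size in bytes of that largest array, excluding any partial element. */
size_t demo_fam_bytes(size_t alloc_bytes, size_t fam_offset, size_t elem_size)
{
	return demo_fam_capacity(alloc_bytes, fam_offset, elem_size) * elem_size;
}
```

demo_fam_capacity(1024, 28, 8) is 124 and demo_fam_bytes(1024, 28, 8) is
992, while demo_fam_capacity(33, 28, 1) is 5 for the char case.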

[1] https://godbolt.org/z/a5eM3z8PY
[2] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf

Best Regards
Jan
Jan Hendrik Farr Oct. 7, 2024, 3:10 p.m. UTC | #33
On 07 05:56:46, Jan Hendrik Farr wrote:
> > I want to separate several easily confused issues. Instead of just
> > saying __bdos, let's clearly refer to what calculation within bdos is
> > being used. There are 3 choices currently:
> > - alloc_size attribute
> > - counted_by attribute
> > - fallback to __bos (which is similar to sizeof(), except that FAMs are 0 sized)
> > 
> > Additionally there are (for all intents and purposes) 2 size
> > determinations to be made by __bos and __bdos, via argument 2:
> > - containing object size (type 0) ("maximum size")
> > - specific object size (type 1) ("minimum size")
> 
> "maximum" vs "minimum" size would be type 0 vs type 2, but I think you
> do mean type 0 and type 1 as those are the types currently used by
> __struct_size and __member_size. Those are both "maximum" sizes.
> 
> > 
> > For example, consider:
> > 
> > struct posix_acl *acl = malloc(1024);
> > acl->a_count = 1;
> > 
> > what should these return:
> > 
> > 	__bos(acl, 0)
> > 	__bos(acl, 1)
> > 	__bdos(acl, 0)
> > 	__bdos(acl, 1)
> > 	__bos(acl->a_entries, 0)
> > 	__bos(acl->a_entries, 1)
> > 	__bdos(acl->a_entries, 0)
> > 	__bdos(acl->a_entries, 1)
> > 
> 
> I gathered some data from clang and gcc for all these cases and
> additionally varied whether the allocation size is a compile time known
> constant, runtime known, or not known. I also varied whether
> __counted_by was used.
> 
> Source code: [1]
> 
> 
> Abbreviations:
> 
> FAM      = flexible array member
> -1       = SIZE_MAX
> p->a_ent = p->a_entries
> comp.    = allocation size is compile time known
> run.     = allocation size is runtime known
> none     = allocation size is unknown
> count    = __counted_by attribute in use
> correct  = What I think the correct answers should be. In some places I
> have two answers. In that case the second number is what the kernel
> currently expects.
> 
> 
> And here's the data:
> 
> function        |comp.|run.|none|count| gcc  |clang |correct
> ----------------|-----|----|----|-----|------|------|-----
> bos(p, 0)       |  x  |    |    |     | 1024 | 1024 | 1024
> bos(p, 0)       |     | x  |    |     |  -1  |  -1  | -1
> bos(p, 0)       |     |    | x  |     |  -1  |  -1  | -1
> bos(p, 0)       |  x  |    |    |  x  | 1024 | 1024 | 1024
> bos(p, 0)       |     | x  |    |  x  |  -1  |  -1  | -1
> bos(p, 0)       |     |    | x  |  x  |  -1  |  -1  | -1
> ----------------|-----|----|----|-----|------|------|-----
> bos(p, 1)       |  x  |    |    |     | 1024 | 1024 | 1024
> bos(p, 1)       |     | x  |    |     |  -1  |  -1  | -1
> bos(p, 1)       |     |    | x  |     |  -1  |  -1  | -1
> bos(p, 1)       |  x  |    |    |  x  | 1024 | 1024 | 1024
> bos(p, 1)       |     | x  |    |  x  |  -1  |  -1  | -1
> bos(p, 1)       |     |    | x  |  x  |  -1  |  -1  | -1
> ----------------|-----|----|----|-----|------|------|-----
> bdos(p, 0)      |  x  |    |    |     | 1024 | 1024 | 1024
> bdos(p, 0)      |     | x  |    |     | 1024 | 1024 | 1024
> bdos(p, 0)      |     |    | x  |     |  -1  |  -1  | -1
> bdos(p, 0)      |  x  |    |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 0)      |     | x  |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 0)      |     |    | x  |  x  |  -1  |  36  | 43 / 40
> ----------------|-----|----|----|-----|------|------|-----
> bdos(p, 1)      |  x  |    |    |     | 1024 | 1024 | 1024
> bdos(p, 1)      |     | x  |    |     | 1024 | 1024 | 1024
> bdos(p, 1)      |     |    | x  |     |  -1  |  -1  | -1
> bdos(p, 1)      |  x  |    |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 1)      |     | x  |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 1)      |     |    | x  |  x  |  -1  |  36  | 43 / 40
> ----------------|-----|----|----|-----|------|------|-----
> bos(p->a_ent, 0)|  x  |    |    |     |  996 | 996  | 996
> bos(p->a_ent, 0)|     | x  |    |     |  -1  |  -1  | -1
> bos(p->a_ent, 0)|     |    | x  |     |  -1  |  -1  | -1
> bos(p->a_ent, 0)|  x  |    |    |  x  |  996 | 996  | 996
> bos(p->a_ent, 0)|     | x  |    |  x  |  -1  |  -1  | -1
> bos(p->a_ent, 0)|     |    | x  |  x  |  -1  |  -1  | -1
> ----------------|-----|----|----|-----|------|------|-----
> bos(p->a_ent, 1)|  x  |    |    |     |  996 | 996  | 992
> bos(p->a_ent, 1)|     | x  |    |     |  -1  |  -1  | -1
> bos(p->a_ent, 1)|     |    | x  |     |  -1  |  -1  | -1
> bos(p->a_ent, 1)|  x  |    |    |  x  |  996 | 996  | 992
> bos(p->a_ent, 1)|     | x  |    |  x  |  -1  |  -1  | -1
> bos(p->a_ent, 1)|     |    | x  |  x  |  -1  |  -1  | -1
> ----------------|-----|----|----|-----|------|------|-----
> bdos(p->a_ent,0)|  x  |    |    |     |  996 | 996  | 996
> bdos(p->a_ent,0)|     | x  |    |     |  996 | 996  | 996
> bdos(p->a_ent,0)|     |    | x  |     |  -1  |  -1  | -1
> bdos(p->a_ent,0)|  x  |    |    |  x  |   8  |  8   |  8
> bdos(p->a_ent,0)|     | x  |    |  x  |   8  |  8   |  8
> bdos(p->a_ent,0)|     |    | x  |  x  |   8  |  8   |  8


These previous three should probably actually be like this:
bdos(p->a_ent,0)|  x  |    |    |  x  |   8  |  8   |  15
bdos(p->a_ent,0)|     | x  |    |  x  |   8  |  8   |  15
bdos(p->a_ent,0)|     |    | x  |  x  |   8  |  8   |  15

They should include the allowed padding after the FAM, as this is a type
0 bdos. Not really an issue here, as the kernel expects 8 here.


> ----------------|-----|----|----|-----|------|------|-----
> bdos(p->a_ent,1)|  x  |    |    |     |  996 | 996  | 992
> bdos(p->a_ent,1)|     | x  |    |     |  996 | 996  | 992
> bdos(p->a_ent,1)|     |    | x  |     |  -1  |  -1  | -1
> bdos(p->a_ent,1)|  x  |    |    |  x  |   8  |  8   |  8
> bdos(p->a_ent,1)|     | x  |    |  x  |   8  |  8   |  8
> bdos(p->a_ent,1)|     |    | x  |  x  |   8  |  8   |  8
> ----------------|-----|----|----|-----|------|------|-----
> 
> bos only uses the allocation size to give its answers. It only works if
> it is a compile time known constant. bos also does not utilize the
> __counted_by attribute.
> 
> bdos on the other hand allows the allocation size to be runtime known.
> It also makes use of the __counted_by attribute if present, which always
> takes precedence over the allocation size when the compiler supports it
> for the particular case. So in those cases you can "lie" to the compiler
> about the size of an object.
> 
> clang supports the __counted_by attribute for both cases (p and
> p->a_entries). gcc only supports it for p->a_entries cases.
> 
> 
> 
> Issue A (clang)
> =======
> 
> function        |comp.|run.|none|count| gcc  |clang |correct
> ----------------|-----|----|----|-----|------|------|-----
> bdos(p, 0)      |  x  |    |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 0)      |     | x  |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 0)      |     |    | x  |  x  |  -1  |  36  | 43 / 40
> bdos(p, 1)      |  x  |    |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 1)      |     | x  |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 1)      |     |    | x  |  x  |  -1  |  36  | 43 / 40
> 
> These cases also represent the "bdos off by 4" issue in clang. clang
> will compute these results using:
> 
> max(sizeof(struct posix_acl), offsetof(struct posix_acl, a_entries) +
> count * sizeof(struct posix_acl_entry)) = 36
> 
> The kernel on the other hand expects this behavior:
> 
> sizeof(struct posix_acl) + count * sizeof(struct posix_acl_entry) = 40
> 
> 
> I think the correct calculation would actually be this:
> 
> offsetof(struct posix_acl, a_entries)
> + (acl->a_count + 1) * sizeof(struct posix_acl_entry) - 1 = 43
> 
> The C11 standard says that when the . or -> operator is used on a struct
> with an FAM it behaves like the FAM was replaced with the largest array
> (with the same element type) that would not make the object any larger
> (see page 113 and 114 of [2]).
> So there are actually multiple sizes of the object that are consistent
> with a count of 1.
> 
> malloc-max = maximum size of the object
> malloc-min = minimum size of the object
> FAME = flexible array member element
> (FAME) = hypothetical 2nd FAME
> 
> <-----------------malloc-max-------------->
> <-----------------malloc-min------->
> <------sizeof(posix_acl)------->
>                             <-FAME-><(FAME)>
> 
> The clang documentation of type 0 (vs type 2) bdos says this:
> 
> If ``type & 2 == 0``, the least ``n`` is returned such that accesses to 
>    ``(const char*)ptr + n`` and beyond are known to be out of bounds.
> 
> We only _know_ that access to the last byte of a 2nd hypothetical FAME
> would be out of bounds. All the bytes before that are padding that is
> allowed by the standard.
> 
> 
> However, this calculation doesn't get the kernel out of trouble either.
> While it would fix the issue for this particular struct, it does not
> solve it for all structs:
> 
> What if the elements of the FAM were chars instead of
> struct posix_acl_entry here? In that case the kernel is back to
> overestimating the size of the struct / underreporting the count to the
> compiler. So while I think this answer is more correct it doesn't
> actually solve the issue.
> 
> Example:
> Let's say the kernel allocates one of these posix_acl_char structs for a
> single char in the array:
> 
> malloc(sizeof(posix_acl_char) + 1 * sizeof(char)) = 33
> 
> The C standard actually says that this object will behave like this when
> the FAM is accessed:
> 
> struct posix_acl_char {
>     refcount_t a_refcount;
>     struct rcu_head a_rcu;
>     unsigned int a_count;
>     char a_entries[5];
> };
> 
> a_count should be set to 5, not 1!
> 
> 
> So we would really need an option to tell the compiler to use the same
> size calculation as the kernel expects here, or maybe be able to specify
> an offset in the __counted_by attribute. Alternatively clang could use
> an option to disable the use of __counted_by for cases where the whole
> struct is passed. This would make it behave like gcc.
> 
> 
> 
> Issue B (clang + gcc)
> =======
> 
> A less serious issue happens with these cases:
> 
> function        |comp.|run.|none|count| gcc  |clang |correct
> ----------------|-----|----|----|-----|------|------|-----
> bos(p->a_ent, 1)|  x  |    |    |     |  996 | 996  | 992
> bos(p->a_ent, 1)|  x  |    |    |  x  |  996 | 996  | 992
> bdos(p->a_ent,1)|  x  |    |    |     |  996 | 996  | 992
> bdos(p->a_ent,1)|     | x  |    |     |  996 | 996  | 992
> 
> In this case the size returned by bos/bdos is too large, so this won't
> lead to false positives. Both clang and gcc simply compute the distance
> from the start of the FAM to the end of the whole
> object. I believe this is wrong. According to the C standard the object
> should behave like the FAM was replaced with the largest array that does
> not make the object any larger. The size of that array is 124 elements.
> So the posix_acl becomes:
> 
> struct posix_acl {
>     refcount_t a_refcount;
>     struct rcu_head a_rcu;
>     unsigned int a_count;
>     struct posix_acl_entry a_entries[124];
> };
> 
> Since this is a type 1 bos/bdos it should return the size of just the
> array, which is 124 * 8 = 992, and not 124.5 * 8 = 996.
> 
> [1] https://godbolt.org/z/a5eM3z8PY
> [2] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf
> 
> Best Regards
> Jan
>
Bill Wendling Oct. 14, 2024, 9:39 p.m. UTC | #34
On Fri, Oct 4, 2024 at 10:13 AM Kees Cook <kees@kernel.org> wrote:
> On Thu, Oct 03, 2024 at 11:48:18PM +0200, Jan Hendrik Farr wrote:
> > On 03 14:28:01, Kees Cook wrote:
> > > On Thu, Oct 03, 2024 at 05:17:08PM +0200, Jan Hendrik Farr wrote:
> > > > gcc currently says that the __bdos of struct containing a flexible array
> > > > member is:
> > > >
> > > > sizeof(<whole struct>) + sizeof(<flexible array element>) * <count>
> > > >
> > > > clang however does the following:
> > > >
> > > > max(sizeof(<whole struct>), offsetof(<flexible array member>) + sizeof(<flexible array element>) * <count>)
> > >
> > > Clang's calculation seems very wrong. I would expect it to match GCC's.
> > >
> >
> > I was on the very same train of thought, but I have since changed my
> > mind a bit. A struct containing a flexible array member can be allocated in
> > two ways:
> >
> > (1):
> >
> > struct posix_acl *acl = malloc(sizeof(struct posix_acl) + sizeof(struct posix_acl_entry) * 1);
> > > acl->a_count = 1;
> >
> > or (2):
> >
> > struct posix_acl *acl = malloc(offsetof(struct posix_acl, a_entries) + sizeof(struct posix_acl_entry) * 1);
> > > acl->a_count = 1;
> >
> > Both are valid ways to allocate it. __bdos does not know which of these
> > methods was used to allocate the struct whose size it has to determine,
> > so it's giving the lower bound that doesn't include the (potential)
> > padding at the end.
>
Slightly off topic: while I was looking at the definition for
struct_size, I noticed this:

#define flex_array_size(p, member, count)                               \
        __builtin_choose_expr(__is_constexpr(count),                    \
                (count) * sizeof(*(p)->member) + __must_be_array((p)->member),  \
                size_mul(count, sizeof(*(p)->member) + __must_be_array((p)->member)))

In particular the 'size_mul' line. I realize that '__must_be_array'
will return a '0' or static compiler error, but it seems inconsistent
that it's included as part of 'size_mul'. Maybe something like this
instead (notice the parens on the last line)?

#define flex_array_size(p, member, count)                               \
        __builtin_choose_expr(__is_constexpr(count),                    \
                (count) * sizeof(*(p)->member) + __must_be_array((p)->member),  \
                size_mul(count, sizeof(*(p)->member)) + __must_be_array((p)->member))
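
For anyone who wants to check that the two placements evaluate identically,
here is a minimal userspace sketch. Everything named my_* or flex_size_* is
a stand-in made up for illustration, not the kernel's definition:
my_must_be_array() is simply hard-wired to 0, and the saturating multiply
assumes the GCC/Clang __builtin_mul_overflow builtin is available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: hard-wired to the 0 the real check yields on success. */
#define my_must_be_array(a) 0

/* Hypothetical saturating multiply: returns SIZE_MAX on overflow. */
static inline size_t my_size_mul(size_t a, size_t b)
{
	size_t r;

	if (__builtin_mul_overflow(a, b, &r))
		return SIZE_MAX;
	return r;
}

/* Runtime branch as currently written: the 0 is added inside the multiply. */
#define flex_size_inside(p, member, count) \
	my_size_mul(count, sizeof(*(p)->member) + my_must_be_array((p)->member))

/* Runtime branch as suggested: the 0 is added after the multiply. */
#define flex_size_outside(p, member, count) \
	(my_size_mul(count, sizeof(*(p)->member)) + my_must_be_array((p)->member))

struct demo {
	unsigned int n;
	uint64_t items[];
};

/* Evaluates both forms for 'count' items and checks that they agree. */
static size_t demo_flex_size(size_t count)
{
	/* The null pointer is only named inside sizeof, never dereferenced. */
	size_t inside = flex_size_inside((struct demo *)0, items, count);
	size_t outside = flex_size_outside((struct demo *)0, items, count);

	assert(inside == outside);
	return outside;
}
```

Since the added term is always 0, the change would be purely about
readability and consistency with the constant-expression branch.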


> I want to separate several easily confused issues. Instead of just
> saying __bdos, let's clearly refer to what calculation within bdos is
> being used. There are 3 choices currently:
> - alloc_size attribute
> - counted_by attribute
> - fallback to __bos (which is similar to sizeof(), except that FAMs are 0 sized)
>
> Additionally there are (for all intents and purposes) 2 size
> determinations to be made by __bos and __bdos, via argument 2:
> - containing object size (type 0) ("maximum size")
> - specific object size (type 1) ("minimum size")
>
> For example, consider:
>
> struct posix_acl *acl = malloc(1024);
> acl->a_count = 1;
>
> what should these return:
>
>         __bos(acl, 0)
>         __bos(acl, 1)
>         __bdos(acl, 0)
>         __bdos(acl, 1)
>         __bos(acl->a_entries, 0)
>         __bos(acl->a_entries, 1)
>         __bdos(acl->a_entries, 0)
>         __bdos(acl->a_entries, 1)
>
> > So it comes down to false positives vs false negatives...
> > More details here:
> > https://github.com/llvm/llvm-project/pull/111015
> >
> > Clangs current behavior would essentially force kernel code to always
> > assume option (2) is used. So
> >
> > struct posix_acl *
> > posix_acl_clone(const struct posix_acl *acl, gfp_t flags)
> > {
> >       struct posix_acl *clone = NULL;
> >
> >       if (acl) {
> >               int size = sizeof(struct posix_acl) + acl->a_count *
> >                          sizeof(struct posix_acl_entry);
> >               clone = kmemdup(acl, size, flags);
> >               if (clone)
> >                       refcount_set(&clone->a_refcount, 1);
> >       }
> >       return clone;
> > }
> > EXPORT_SYMBOL_GPL(posix_acl_clone);
> >
> > from linux/fs/posix_acl.c would have to turn into something like:
> >
> > struct posix_acl *
> > posix_acl_clone(const struct posix_acl *acl, gfp_t flags)
> > {
> >       struct posix_acl *clone = NULL;
> >
> >       if (acl) {
> >               int size = offsetof(struct posix_acl, a_entries) + acl->a_count *
> >                          sizeof(struct posix_acl_entry);
> >               clone = kmemdup(acl, size, flags);
> >               if (clone)
> >                       refcount_set(&clone->a_refcount, 1);
> >       }
> >       return clone;
> > }
> > EXPORT_SYMBOL_GPL(posix_acl_clone);
> >
> > Which is actually safer, because can you actually be sure this posix_acl
> > wasn't allocated using method (2)?
>
> First, this should not be using an open coded calculation at all; it
> should use the struct_size() macro.
>
> Secondly, if we want to change struct_size(), then we must (via
> allmodconfig builds) determine all the places in the kernel
> where the calculated size changes, and audit those for safety.
>
> Right now, struct_size() over-estimates in the face of padding.
>
> We're already moving the kernel toward not even calling struct_size()
> externally from the allocation, and instead using it within the
> allocation macros themselves:
> https://lore.kernel.org/lkml/20240822231324.make.666-kees@kernel.org/
>
> > After looking at the assembly produced by gcc more, it actually looks
> > like it's using the allocation size if it's known in the current context
> > (for example if the struct was just malloced in the same function)
> > and otherwise returns INT_MAX for the __bdos of a struct containing a
> > flexible array member. It's only returning the size based on the
> > __counted_by attribute if you ask it for the __bdos of the flexible
> > array member itself.
>
> Here is my test case for all the corner cases we've found so far:
> https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c
>
> I'd prefer we add cases there so we can all be talking about the same
> things. :)
>
> -Kees
>
> --
> Kees Cook
Bill Wendling Oct. 16, 2024, 1:22 a.m. UTC | #35
On Thu, Oct 3, 2024 at 4:33 AM Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> On 02 11:18:57, Thorsten Blum wrote:
> > On 28. Sep 2024, at 22:34, Kees Cook <kees@kernel.org> wrote:
> > > [...]
> > >
> > > Sorry, I've been out of commission with covid. Globally disabling this
> > > macro for clang is not the right solution (way too big a hammer).
> > >
> > > Until Bill has a fix, we can revert commit
> > > 86e92eeeb23741a072fe7532db663250ff2e726a, as the problem is limited to
> > > certain situations where 'counted_by' is in use.
> >
> > I already encountered two other related __counted_by() issues [1][2]
> > that are now being reverted. Would it be an option to disable it
> > globally, but only for Clang < v19 (where it looks like it'll be fixed)?
> >
> > Otherwise adding __counted_by() might be a slippery slope for a long
> > time and the edge cases don't seem to be that rare anymore.
> >
> > Thanks,
> > Thorsten
> >
> > [1] https://lore.kernel.org/all/20240909162725.1805-2-thorsten.blum@toblux.com/
> > [2] https://lore.kernel.org/all/20240923213809.235128-2-thorsten.blum@linux.dev/
>
> This issue is now fixed on the llvm main branch:
> https://github.com/llvm/llvm-project/commit/882457a2eedbe6d53161b2f78fcf769fc9a93e8a
>
> So presumably this will go into 19.1.2, not sure what this means for
> distros that ship clang 18. Will they have to be notified to backport
> this?
>
FYI, Clang 19.1.2 shipped with your fix in it.
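
In case it is useful for out-of-tree code, a version gate along the lines
Thorsten suggested could look roughly like the sketch below. This is not
what the kernel does; my_counted_by, CLANG_VERSION_CODE and
COUNTED_BY_TRUSTED are names invented here, and the 19.1.2 cutoff is taken
from this thread only. Vendor builds of Clang may number their releases
differently, so treat the comparison as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/* Hypothetical gate: on Clang, only trust counted_by from 19.1.2 onwards. */
#if defined(__clang__)
# define CLANG_VERSION_CODE \
	(__clang_major__ * 10000 + __clang_minor__ * 100 + __clang_patchlevel__)
# define COUNTED_BY_TRUSTED (CLANG_VERSION_CODE >= 190102)
#else
# define COUNTED_BY_TRUSTED 1
#endif

/* Expands to nothing when the attribute is missing or not trusted. */
#if COUNTED_BY_TRUSTED && __has_attribute(counted_by)
# define my_counted_by(member) __attribute__((counted_by(member)))
#else
# define my_counted_by(member)
#endif

struct counted_buf {
	size_t len;
	unsigned char data[] my_counted_by(len);
};

/* Allocates a zeroed buffer with room for 'len' bytes in data[]. */
static struct counted_buf *counted_buf_alloc(size_t len)
{
	struct counted_buf *b;

	assert(len <= SIZE_MAX - sizeof(*b));	/* sketch only: no overflow handling */
	b = malloc(sizeof(*b) + len);
	if (!b)
		return NULL;
	b->len = len;	/* set the count before any access to data[] */
	memset(b->data, 0, len);
	return b;
}
```

With the gate closed the struct is an ordinary flexible-array struct, so
callers do not need to change either way.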

-bw
Jan Hendrik Farr Oct. 16, 2024, 2:18 a.m. UTC | #36
On 15 18:22:50, Bill Wendling wrote:
> On Thu, Oct 3, 2024 at 4:33 AM Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> > On 02 11:18:57, Thorsten Blum wrote:
> > > On 28. Sep 2024, at 22:34, Kees Cook <kees@kernel.org> wrote:
> > > > [...]
> > > >
> > > > Sorry, I've been out of commission with covid. Globally disabling this
> > > > macro for clang is not the right solution (way too big a hammer).
> > > >
> > > > Until Bill has a fix, we can revert commit
> > > > 86e92eeeb23741a072fe7532db663250ff2e726a, as the problem is limited to
> > > > certain situations where 'counted_by' is in use.
> > >
> > > I already encountered two other related __counted_by() issues [1][2]
> > > that are now being reverted. Would it be an option to disable it
> > > globally, but only for Clang < v19 (where it looks like it'll be fixed)?
> > >
> > > Otherwise adding __counted_by() might be a slippery slope for a long
> > > time and the edge cases don't seem to be that rare anymore.
> > >
> > > Thanks,
> > > Thorsten
> > >
> > > [1] https://lore.kernel.org/all/20240909162725.1805-2-thorsten.blum@toblux.com/
> > > [2] https://lore.kernel.org/all/20240923213809.235128-2-thorsten.blum@linux.dev/
> >
> > This issue is now fixed on the llvm main branch:
> > https://github.com/llvm/llvm-project/commit/882457a2eedbe6d53161b2f78fcf769fc9a93e8a
> >
> > So presumably this will go into 19.1.2, not sure what this means for
> > distros that ship clang 18. Will they have to be notified to backport
> > this?
> >
> FYI, Clang 19.1.2 shipped with your fix in it.
> 

Thx for the info.

How should we continue with the "off by 4" issue? The way I see it, either
the kernel has to change struct_size (lots of work) or clang has to get
an option to follow the kernel's behavior. I'm in favor of adding an
option to clang.

Ideally I think it shouldn't be a global option but one that you can
set per __bdos invocation. So either include it in the type argument or
create a separate builtin for it.

What are your thoughts on this?


Best Regards
Jan
Kees Cook Oct. 16, 2024, 8:43 p.m. UTC | #37
On Wed, Oct 16, 2024 at 04:18:19AM +0200, Jan Hendrik Farr wrote:
> On 15 18:22:50, Bill Wendling wrote:
> > On Thu, Oct 3, 2024 at 4:33 AM Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> > > On 02 11:18:57, Thorsten Blum wrote:
> > > > On 28. Sep 2024, at 22:34, Kees Cook <kees@kernel.org> wrote:
> > > > > [...]
> > > > >
> > > > > Sorry, I've been out of commission with covid. Globally disabling this
> > > > > macro for clang is not the right solution (way too big a hammer).
> > > > >
> > > > > Until Bill has a fix, we can revert commit
> > > > > 86e92eeeb23741a072fe7532db663250ff2e726a, as the problem is limited to
> > > > > certain situations where 'counted_by' is in use.
> > > >
> > > > I already encountered two other related __counted_by() issues [1][2]
> > > > that are now being reverted. Would it be an option to disable it
> > > > globally, but only for Clang < v19 (where it looks like it'll be fixed)?
> > > >
> > > > Otherwise adding __counted_by() might be a slippery slope for a long
> > > > time and the edge cases don't seem to be that rare anymore.
> > > >
> > > > Thanks,
> > > > Thorsten
> > > >
> > > > [1] https://lore.kernel.org/all/20240909162725.1805-2-thorsten.blum@toblux.com/
> > > > [2] https://lore.kernel.org/all/20240923213809.235128-2-thorsten.blum@linux.dev/
> > >
> > > This issue is now fixed on the llvm main branch:
> > > https://github.com/llvm/llvm-project/commit/882457a2eedbe6d53161b2f78fcf769fc9a93e8a
> > >
> > > So presumably this will go into 19.1.2, not sure what this means for
> > > distros that ship clang 18. Will they have to be notified to backport
> > > this?
> > >
> > FYI, Clang 19.1.2 shipped with your fix in it.
> > 
> 
> Thx for the info.
> 
> How should we continue with the "off by 4" issue? The way I see it, either
> the kernel has to change struct_size (lots of work) or clang has to get
> an option to follow the kernel's behavior. I'm in favor of adding an
> option to clang.

I'm planning on checking how much impact there is on the kernel to fix
struct_size() to be precise. It really does need to match __bdos for
Clang and GCC.

-Kees
Kees Cook Oct. 16, 2024, 9:13 p.m. UTC | #38
On Mon, Oct 07, 2024 at 05:10:53PM +0200, Jan Hendrik Farr wrote:
> On 07 05:56:46, Jan Hendrik Farr wrote:
> > > I want to separate several easily confused issues. Instead of just
> > > saying __bdos, let's clearly refer to what calculation within bdos is
> > > being used. There are 3 choices currently:
> > > - alloc_size attribute
> > > - counted_by attribute
> > > - fallback to __bos (which is similar to sizeof(), except that FAMs are 0 sized)
> > > 
> > > Additionally there are (for all intents and purposes) 2 size
> > > determinations to be made by __bos and __bdos, via argument 2:
> > > - containing object size (type 0) ("maximum size")
> > > - specific object size (type 1) ("minimum size")
> > 
> > "maximum" vs "minimum" size would by type 0 vs type 2, but I think you
> > do mean type 0 and type 1 as those are the types currently used by
> > __struct_size and __member_size. Those are both "maximum" sizes.
> > 
> > > 
> > > For example, consider:
> > > 
> > > struct posix_acl *acl = malloc(1024);
> > > acl->a_count = 1;
> > > 
> > > what should these return:
> > > 
> > > 	__bos(acl, 0)
> > > 	__bos(acl, 1)
> > > 	__bdos(acl, 0)
> > > 	__bdos(acl, 1)
> > > 	__bos(acl->a_entries, 0)
> > > 	__bos(acl->a_entries, 1)
> > > 	__bdos(acl->a_entries, 0)
> > > 	__bdos(acl->a_entries, 1)
> > > 
> > 
> > > I gathered some data from clang and gcc for all these cases and
> > additionally varied whether the allocation size is a compile time known
> > constant, runtime known, or not known. I also varied whether
> > __counted_by was used.
> > 
> > Source code: [1]
> > 
> > 
> > Abbreviations:
> > 
> > FAM      = flexible array member
> > -1       = SIZE_MAX
> > p->a_ent = p->a_entries
> > comp.    = allocation size is compile time known
> > > run.     = allocation size is runtime known
> > none     = allocation size is unknown
> > count    = __counted_by attribute in use
> > correct  = What I think the correct answers should be. In some places I
> > have two answers. In that case the second number is what the kernel
> > currently expects.
> > 
> > 
> > And here's the data:
> > 
> > function        |comp.|run.|none|count| gcc  |clang |correct
> > ----------------|-----|----|----|-----|------|------|-----
> > bos(p, 0)       |  x  |    |    |     | 1024 | 1024 | 1024
> > bos(p, 0)       |     | x  |    |     |  -1  |  -1  | -1
> > bos(p, 0)       |     |    | x  |     |  -1  |  -1  | -1
> > bos(p, 0)       |  x  |    |    |  x  | 1024 | 1024 | 1024
> > bos(p, 0)       |     | x  |    |  x  |  -1  |  -1  | -1
> > bos(p, 0)       |     |    | x  |  x  |  -1  |  -1  | -1
> > ----------------|-----|----|----|-----|------|------|-----
> > bos(p, 1)       |  x  |    |    |     | 1024 | 1024 | 1024
> > bos(p, 1)       |     | x  |    |     |  -1  |  -1  | -1
> > bos(p, 1)       |     |    | x  |     |  -1  |  -1  | -1
> > bos(p, 1)       |  x  |    |    |  x  | 1024 | 1024 | 1024
> > bos(p, 1)       |     | x  |    |  x  |  -1  |  -1  | -1
> > bos(p, 1)       |     |    | x  |  x  |  -1  |  -1  | -1
> > ----------------|-----|----|----|-----|------|------|-----
> > bdos(p, 0)      |  x  |    |    |     | 1024 | 1024 | 1024
> > bdos(p, 0)      |     | x  |    |     | 1024 | 1024 | 1024
> > bdos(p, 0)      |     |    | x  |     |  -1  |  -1  | -1
> > bdos(p, 0)      |  x  |    |    |  x  | 1024 |  36  | 43 / 40
> > bdos(p, 0)      |     | x  |    |  x  | 1024 |  36  | 43 / 40
> > bdos(p, 0)      |     |    | x  |  x  |  -1  |  36  | 43 / 40
> > ----------------|-----|----|----|-----|------|------|-----
> > bdos(p, 1)      |  x  |    |    |     | 1024 | 1024 | 1024
> > bdos(p, 1)      |     | x  |    |     | 1024 | 1024 | 1024
> > bdos(p, 1)      |     |    | x  |     |  -1  |  -1  | -1
> > bdos(p, 1)      |  x  |    |    |  x  | 1024 |  36  | 43 / 40
> > bdos(p, 1)      |     | x  |    |  x  | 1024 |  36  | 43 / 40
> > bdos(p, 1)      |     |    | x  |  x  |  -1  |  36  | 43 / 40
> > ----------------|-----|----|----|-----|------|------|-----
> > bos(p->a_ent, 0)|  x  |    |    |     |  996 | 996  | 996
> > bos(p->a_ent, 0)|     | x  |    |     |  -1  |  -1  | -1
> > bos(p->a_ent, 0)|     |    | x  |     |  -1  |  -1  | -1
> > bos(p->a_ent, 0)|  x  |    |    |  x  |  996 | 996  | 996
> > bos(p->a_ent, 0)|     | x  |    |  x  |  -1  |  -1  | -1
> > bos(p->a_ent, 0)|     |    | x  |  x  |  -1  |  -1  | -1
> > ----------------|-----|----|----|-----|------|------|-----
> > bos(p->a_ent, 1)|  x  |    |    |     |  996 | 996  | 992
> > bos(p->a_ent, 1)|     | x  |    |     |  -1  |  -1  | -1
> > bos(p->a_ent, 1)|     |    | x  |     |  -1  |  -1  | -1
> > bos(p->a_ent, 1)|  x  |    |    |  x  |  996 | 996  | 992
> > bos(p->a_ent, 1)|     | x  |    |  x  |  -1  |  -1  | -1
> > bos(p->a_ent, 1)|     |    | x  |  x  |  -1  |  -1  | -1
> > ----------------|-----|----|----|-----|------|------|-----
> > bdos(p->a_ent,0)|  x  |    |    |     |  996 | 996  | 996
> > bdos(p->a_ent,0)|     | x  |    |     |  996 | 996  | 996
> > bdos(p->a_ent,0)|     |    | x  |     |  -1  |  -1  | -1
> > bdos(p->a_ent,0)|  x  |    |    |  x  |   8  |  8   |  8
> > bdos(p->a_ent,0)|     | x  |    |  x  |   8  |  8   |  8
> > bdos(p->a_ent,0)|     |    | x  |  x  |   8  |  8   |  8
> 
> 
> These previous three should probably actually be like this:
> bdos(p->a_ent,0)|  x  |    |    |  x  |   8  |  8   |  15
> bdos(p->a_ent,0)|     | x  |    |  x  |   8  |  8   |  15
> bdos(p->a_ent,0)|     |    | x  |  x  |   8  |  8   |  15
> 
> They should include the allowed padding after the FAM, as this is a type
> 0 bdos. Not really an issue here, as the kernel expects 8 here.
> 
> 
> > ----------------|-----|----|----|-----|------|------|-----
> > bdos(p->a_ent,1)|  x  |    |    |     |  996 | 996  | 992
> > bdos(p->a_ent,1)|     | x  |    |     |  996 | 996  | 992
> > bdos(p->a_ent,1)|     |    | x  |     |  -1  |  -1  | -1
> > bdos(p->a_ent,1)|  x  |    |    |  x  |   8  |  8   |  8
> > bdos(p->a_ent,1)|     | x  |    |  x  |   8  |  8   |  8
> > bdos(p->a_ent,1)|     |    | x  |  x  |   8  |  8   |  8
> > ----------------|-----|----|----|-----|------|------|-----

Thanks for building all these tables -- I want to look at it all again
myself, but I'm pretty well convinced you've found a bunch of things we
need to sort out.

> > > bos only uses the allocation size to give its answers. It only works if
> > it is a compile time known constant. bos also does not utilize the
> > __counted_by attribute.
> > 
> > bdos on the other hand allows the allocation size to be runtime known.
> > It also makes use of the __counted_by attribute if present, which always
> > takes precedence over the allocation size when the compiler supports it
> > for the particular case. So in those cases you can "lie" to the compiler
> > about the size of an object.

I am okay with counted_by winning out over alloc_size when it is present.
However, I would expect this:

void *v = malloc(1024);
struct flex *p = v;
p->counter = 4;

__bdos(p, 1) == struct_size(p, array, 4)	// "p" has both size hints
__bdos(v, 1) == 1024				// "v" only has alloc_size

> > 
> > clang supports the __counted_by attribute for both cases (p and
> > p->a_entries). gcc only supports it for p->a_entries cases.

I think gcc refuses to believe "p" is anything until it has been
dereferenced to a specific type. I would like it if they could fix this,
as if we have a pointer to a type and we're using __bdos to query its
size, we are explicitly treating it as the type it is pointing at (i.e.
examining the counter and evaluating the FAM size).

> > 
> > 
> > 
> > Issue A (clang)
> > =======
> > 
> > function        |comp.|run.|none|count| gcc  |clang |correct
> > ----------------|-----|----|----|-----|------|------|-----
> > bdos(p, 0)      |  x  |    |    |  x  | 1024 |  36  | 43 / 40
> > bdos(p, 0)      |     | x  |    |  x  | 1024 |  36  | 43 / 40
> > bdos(p, 0)      |     |    | x  |  x  |  -1  |  36  | 43 / 40
> > bdos(p, 1)      |  x  |    |    |  x  | 1024 |  36  | 43 / 40
> > bdos(p, 1)      |     | x  |    |  x  | 1024 |  36  | 43 / 40
> > bdos(p, 1)      |     |    | x  |  x  |  -1  |  36  | 43 / 40
> > 
> > These cases also represent the "bdos off by 4" issue in clang. clang
> > will compute these results using:
> > 
> > > max(sizeof(struct posix_acl), offsetof(struct posix_acl, a_entries) +
> > > count * sizeof(struct posix_acl_entry)) = 36
> > 
> > The kernel on the other hand expects this behavior:
> > 
> > > sizeof(struct posix_acl) + count * sizeof(struct posix_acl_entry) = 40
> > 
> > 
> > I think the correct calculation would actually be this:
> > 
> > offsetof(struct posix_acl, a_entries)
> > + (acl->a_count + 1) * sizeof(struct posix_acl_entry) - 1 = 43
> > 
> > The C11 standard says that when the . or -> operator is used on a struct
> > with an FAM it behaves like the FAM was replaced with the largest array
> > (with the same element type) that would not make the object any larger
> > (see page 113 and 114 of [2]).
> > So there are actually multiple sizes of the object that are consistent
> > with a count of 1.
> > 
> > malloc-max = maximum size of the object
> > malloc-min = minimum size of the object
> > FAME = flexible array member element
> > (FAME) = hypothetical 2nd FAME
> > 
> > <-----------------malloc-max-------------->
> > <-----------------malloc-min------->
> > <------sizeof(posix_acl)------->
> >                             <-FAME-><(FAME)>
> > 
> > The clang documentation of type 0 (vs type 2) bdos says this:
> > 
> > If ``type & 2 == 0``, the least ``n`` is returned such that accesses to 
> >    ``(const char*)ptr + n`` and beyond are known to be out of bounds.
> > 
> > > We only _know_ that access to the last byte of a 2nd hypothetical FAME
> > would be out of bounds. All the bytes before that are padding that is
> > allowed by the standard.
> > 
> > 
> > > However, this calculation also doesn't get the kernel out
> > > of trouble here. While this would fix the issue for this particular
> > > struct, it does not solve it for all structs:
> > > 
> > > What if the elements of the FAM were chars instead of
> > > struct posix_acl_entry here? In that case the kernel is back to
> > overestimating the size of the struct / underreporting the count to the
> > compiler. So while I think this answer is more correct it doesn't
> > actually solve the issue.
> > 
> > Example:
> > Let's say the kernel allocates one of these posix_acl_char structs for a
> > single char in the array:
> > 
> > > malloc(sizeof(struct posix_acl_char) + 1 * sizeof(char)) = 33
> > > 
> > > The C standard actually says that this object will behave like this when
> > > the FAM is accessed:
> > > 
> > > struct posix_acl_char {
> >     refcount_t a_refcount;
> >     struct rcu_head a_rcu;
> >     unsigned int a_count;
> >     char a_entries[5];
> > };
> > 
> > a_count should be set to 5, not 1!

This is making my head spin. In a practical sense, a struct instance
has a fixed size (which I would say is the calculation using max(),
above). Whether the . or -> operators are used shouldn't matter (to
the program's view of the object size).

This "1 becomes 5" should just not be allowed. I can accept object padding
contributing to object size, but we should not redefine the array size
out from under what it actually is. I don't want array accesses into
padding either. :P

I think 36 is correct here, not 40 nor 43.
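
To keep the three candidate numbers straight, here is a small userspace
sketch of the calculations being compared. struct acl and struct acl_entry
are stand-ins with the same shape as the kernel structs (the two-pointer
array standing in for struct rcu_head is an assumption), and the function
names are made up for this mail.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins shaped like posix_acl_entry / posix_acl. */
struct acl_entry {
	short tag;
	unsigned short perm;
	unsigned int id;
};

struct acl {
	unsigned int refcount;
	void *rcu[2];		/* stand-in for struct rcu_head */
	unsigned int count;
	struct acl_entry entries[];
};

/* Trailing padding between the start of the FAM and sizeof(struct acl). */
static size_t tail_padding(void)
{
	assert(offsetof(struct acl, entries) <= sizeof(struct acl));
	return sizeof(struct acl) - offsetof(struct acl, entries);
}

/* Clang's result as described in this thread: max(sizeof, offsetof + n * elem). */
static size_t size_clang(size_t count)
{
	size_t exact = offsetof(struct acl, entries) +
		       count * sizeof(struct acl_entry);

	return exact > sizeof(struct acl) ? exact : sizeof(struct acl);
}

/* What the open-coded kernel sizing assumes: sizeof + n * elem. */
static size_t size_kernel(size_t count)
{
	return sizeof(struct acl) + count * sizeof(struct acl_entry);
}

/* Jan's proposed bound: one byte short of a hypothetical (n + 1)-th element. */
static size_t size_upper(size_t count)
{
	return offsetof(struct acl, entries) +
	       (count + 1) * sizeof(struct acl_entry) - 1;
}
```

On an LP64 ABI the stand-in should have a 28-byte header and a sizeof of
32, so count = 1 gives the 36, 40 and 43 from the tables, and the gap
between the first two is exactly tail_padding().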

> > So we would really need an option to tell the compiler to use the same
> > size calculation as the kernel expects here, or maybe be able to specify
> > an offset in the __counted_by attribute. Alternatively clang could use
> > an option to disable the use of __counted_by for cases where the whole
> > struct is passed. This would make it behave like gcc.
> > 
> > 
> > 
> > Issue B (clang + gcc)
> > =======
> > 
> > A less serious issue happens with these cases:
> > 
> > function        |comp.|run.|none|count| gcc  |clang |correct
> > ----------------|-----|----|----|-----|------|------|-----
> > bos(p->a_ent, 1)|  x  |    |    |     |  996 | 996  | 992
> > bos(p->a_ent, 1)|  x  |    |    |  x  |  996 | 996  | 992
> > bdos(p->a_ent,1)|  x  |    |    |     |  996 | 996  | 992
> > bdos(p->a_ent,1)|     | x  |    |     |  996 | 996  | 992
> > 
> > In this case the size returned by bos/bdos is too large, so this won't
> > lead to false positives. Both clang and gcc simply compute the difference
> > between the pointer from the start of the FAM to the end of the whole
> > struct. I believe this is wrong. According to the C standard the object
> > should behave like the FAM was replaced with the largest array that does
> > not make the object any larger. The size of that array is 124 elements.
> > So the posix_acl becomes:
> > 
> > struct posix_acl {
> >     refcount_t a_refcount;
> >     struct rcu_head a_rcu;
> >     unsigned int a_count;
> >     struct posix_acl_entry a_entries[124];
> > };
> > 
> > Since this is a type 1 bos/bdos it should return the size of just the
> > array, which is 124 * 8 = 992, and not 124.5 * 8 = 996.

For type 1, I agree: this needs to be the precise size of the array.
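
The difference between the two answers for Issue B can be written down as
the sketch below. The structs are again userspace stand-ins with the same
shape as the kernel ones (the rcu_head replacement is an assumption), and
both helper names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins shaped like posix_acl_entry / posix_acl. */
struct acl_entry {
	short tag;
	unsigned short perm;
	unsigned int id;
};

struct acl {
	unsigned int refcount;
	void *rcu[2];		/* stand-in for struct rcu_head */
	unsigned int count;
	struct acl_entry entries[];
};

/* What both compilers report today: bytes from the FAM to the end of the object. */
static size_t fam_bytes_to_end(size_t alloc_size)
{
	assert(alloc_size >= offsetof(struct acl, entries));
	return alloc_size - offsetof(struct acl, entries);
}

/* Size of the largest whole-element array that fits in that space. */
static size_t fam_bytes_whole_elements(size_t alloc_size)
{
	size_t n = fam_bytes_to_end(alloc_size) / sizeof(struct acl_entry);

	return n * sizeof(struct acl_entry);
}
```

For a 1024-byte allocation on an LP64 ABI these should come out as 996 and
992 respectively, i.e. the trailing half element is dropped.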

As of now, are GCC and Clang both using:

max(sizeof(*p),
    offsetof(typeof(*p), FAM) + count * sizeof(*p->FAM))

?

Linux can adjust struct_size() to match, and I'll check that nothing
shrunk inappropriately...
Bill Wendling Oct. 16, 2024, 11:41 p.m. UTC | #39
On Sun, Oct 6, 2024 at 8:56 PM Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> > I want to separate several easily confused issues. Instead of just
> > saying __bdos, let's clearly refer to what calculation within bdos is
> > being used. There are 3 choices currently:
> > - alloc_size attribute
> > - counted_by attribute
> > - fallback to __bos (which is similar to sizeof(), except that FAMs are 0 sized)
> >
> > Additionally there are (for all intents and purposes) 2 size
> > determinations to be made by __bos and __bdos, via argument 2:
> > - containing object size (type 0) ("maximum size")
> > - specific object size (type 1) ("minimum size")
>
> "maximum" vs "minimum" size would be type 0 vs type 2, but I think you
> do mean type 0 and type 1 as those are the types currently used by
> __struct_size and __member_size. Those are both "maximum" sizes.
>
> >
> > For example, consider:
> >
> > struct posix_acl *acl = malloc(1024);
> > acl->a_count = 1;
> >
> > what should these return:
> >
> >       __bos(acl, 0)
> >       __bos(acl, 1)
> >       __bdos(acl, 0)
> >       __bdos(acl, 1)
> >       __bos(acl->a_entries, 0)
> >       __bos(acl->a_entries, 1)
> >       __bdos(acl->a_entries, 0)
> >       __bdos(acl->a_entries, 1)
> >
>
Thank you for this detailed write-up! I'm sorry for my late response.

> I gathered some data from clang and gcc for all these cases and
> additionally varied whether the allocation size is a compile time known
> constant, runtime known, or not known. I also varied whether
> __counted_by was used.
>
> Source code: [1]
>
>
> Abbreviations:
>
> FAM      = flexible array member
> -1       = SIZE_MAX
> p->a_ent = p->a_entries
> comp.    = allocation size is compile time known
> run.     = allocation size is runtime known
> none     = allocation size is unknown
> count    = __counted_by attribute in use
> correct  = What I think the correct answers should be. In some places I
> have two answers. In that case the second number is what the kernel
> currently expects.
>
>
> And here's the data:
>
> function        |comp.|run.|none|count| gcc  |clang |correct
> ----------------|-----|----|----|-----|------|------|-----
> bos(p, 0)       |  x  |    |    |     | 1024 | 1024 | 1024
> bos(p, 0)       |     | x  |    |     |  -1  |  -1  | -1
> bos(p, 0)       |     |    | x  |     |  -1  |  -1  | -1
> bos(p, 0)       |  x  |    |    |  x  | 1024 | 1024 | 1024
> bos(p, 0)       |     | x  |    |  x  |  -1  |  -1  | -1
> bos(p, 0)       |     |    | x  |  x  |  -1  |  -1  | -1
> ----------------|-----|----|----|-----|------|------|-----
> bos(p, 1)       |  x  |    |    |     | 1024 | 1024 | 1024
> bos(p, 1)       |     | x  |    |     |  -1  |  -1  | -1
> bos(p, 1)       |     |    | x  |     |  -1  |  -1  | -1
> bos(p, 1)       |  x  |    |    |  x  | 1024 | 1024 | 1024
> bos(p, 1)       |     | x  |    |  x  |  -1  |  -1  | -1
> bos(p, 1)       |     |    | x  |  x  |  -1  |  -1  | -1
> ----------------|-----|----|----|-----|------|------|-----
> bdos(p, 0)      |  x  |    |    |     | 1024 | 1024 | 1024
> bdos(p, 0)      |     | x  |    |     | 1024 | 1024 | 1024
> bdos(p, 0)      |     |    | x  |     |  -1  |  -1  | -1
> bdos(p, 0)      |  x  |    |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 0)      |     | x  |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 0)      |     |    | x  |  x  |  -1  |  36  | 43 / 40
> ----------------|-----|----|----|-----|------|------|-----
> bdos(p, 1)      |  x  |    |    |     | 1024 | 1024 | 1024
> bdos(p, 1)      |     | x  |    |     | 1024 | 1024 | 1024
> bdos(p, 1)      |     |    | x  |     |  -1  |  -1  | -1
> bdos(p, 1)      |  x  |    |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 1)      |     | x  |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 1)      |     |    | x  |  x  |  -1  |  36  | 43 / 40
> ----------------|-----|----|----|-----|------|------|-----
> bos(p->a_ent, 0)|  x  |    |    |     |  996 | 996  | 996
> bos(p->a_ent, 0)|     | x  |    |     |  -1  |  -1  | -1
> bos(p->a_ent, 0)|     |    | x  |     |  -1  |  -1  | -1
> bos(p->a_ent, 0)|  x  |    |    |  x  |  996 | 996  | 996
> bos(p->a_ent, 0)|     | x  |    |  x  |  -1  |  -1  | -1
> bos(p->a_ent, 0)|     |    | x  |  x  |  -1  |  -1  | -1
> ----------------|-----|----|----|-----|------|------|-----
> bos(p->a_ent, 1)|  x  |    |    |     |  996 | 996  | 992
> bos(p->a_ent, 1)|     | x  |    |     |  -1  |  -1  | -1
> bos(p->a_ent, 1)|     |    | x  |     |  -1  |  -1  | -1
> bos(p->a_ent, 1)|  x  |    |    |  x  |  996 | 996  | 992
> bos(p->a_ent, 1)|     | x  |    |  x  |  -1  |  -1  | -1
> bos(p->a_ent, 1)|     |    | x  |  x  |  -1  |  -1  | -1
> ----------------|-----|----|----|-----|------|------|-----
> bdos(p->a_ent,0)|  x  |    |    |     |  996 | 996  | 996
> bdos(p->a_ent,0)|     | x  |    |     |  996 | 996  | 996
> bdos(p->a_ent,0)|     |    | x  |     |  -1  |  -1  | -1
> bdos(p->a_ent,0)|  x  |    |    |  x  |   8  |  8   |  8
> bdos(p->a_ent,0)|     | x  |    |  x  |   8  |  8   |  8
> bdos(p->a_ent,0)|     |    | x  |  x  |   8  |  8   |  8
> ----------------|-----|----|----|-----|------|------|-----
> bdos(p->a_ent,1)|  x  |    |    |     |  996 | 996  | 992
> bdos(p->a_ent,1)|     | x  |    |     |  996 | 996  | 992
> bdos(p->a_ent,1)|     |    | x  |     |  -1  |  -1  | -1
> bdos(p->a_ent,1)|  x  |    |    |  x  |   8  |  8   |  8
> bdos(p->a_ent,1)|     | x  |    |  x  |   8  |  8   |  8
> bdos(p->a_ent,1)|     |    | x  |  x  |   8  |  8   |  8
> ----------------|-----|----|----|-----|------|------|-----
>
> bos only uses the allocation size to give its answers. It only works if
> it is a compile time known constant. bos also does not utilize the
> __counted_by attribute.
>
> bdos on the other hand allows the allocation size to be runtime known.
> It also makes use of the __counted_by attribute if present, which always
> takes precedence over the allocation size when the compiler supports it
> for the particular case. So in those cases you can "lie" to the compiler
> about the size of an object.
>
> clang supports the __counted_by attribute for both cases (p and
> p->a_entries). gcc only supports it for p->a_entries cases.
>
>
>
> Issue A (clang)
> =======
>
> function        |comp.|run.|none|count| gcc  |clang |correct
> ----------------|-----|----|----|-----|------|------|-----
> bdos(p, 0)      |  x  |    |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 0)      |     | x  |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 0)      |     |    | x  |  x  |  -1  |  36  | 43 / 40
> bdos(p, 1)      |  x  |    |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 1)      |     | x  |    |  x  | 1024 |  36  | 43 / 40
> bdos(p, 1)      |     |    | x  |  x  |  -1  |  36  | 43 / 40
>
> These cases also represent the "bdos off by 4" issue in clang. clang
> will compute these results using:
>
> max(sizeof(struct posix_acl), offsetof(struct posix_acl, a_entries) +
> count * sizeof(struct posix_acl_entry)) = 36
>
> The kernel on the other hand expects this behavior:
>
> sizeof(struct posix_acl) + count * sizeof(struct posix_acl_entry) = 40
>
>
> I think the correct calculation would actually be this:
>
> offsetof(struct posix_acl, a_entries)
> + (acl->a_count + 1) * sizeof(struct posix_acl_entry) - 1 = 43
>
> The C11 standard says that when the . or -> operator is used on a struct
> with an FAM it behaves like the FAM was replaced with the largest array
> (with the same element type) that would not make the object any larger
> (see page 113 and 114 of [2]).
> So there are actually multiple sizes of the object that are consistent
> with a count of 1.
>
> malloc-max = maximum size of the object
> malloc-min = minimum size of the object
> FAME = flexible array member element
> (FAME) = hypothetical 2nd FAME
>
> <-----------------malloc-max-------------->
> <-----------------malloc-min------->
> <------sizeof(posix_acl)------->
>                             <-FAME-><(FAME)>
>
> The clang documentation of type 0 (vs type 2) bdos says this:
>
> If ``type & 2 == 0``, the least ``n`` is returned such that accesses to
>    ``(const char*)ptr + n`` and beyond are known to be out of bounds.
>
> We only _know_ that access to the last byte of a 2nd hypothetical FAME
> would be out of bounds. All the bytes before that are padding that is
> allowed by the standard.
>
>
> However, this calculation doesn't get the kernel out of trouble either.
> While this would fix the issue for this particular struct, it does not
> solve it for all structs:
>
> What if the elements of the FAM were chars instead of
> struct posix_acl_entry here? In that case the kernel is back to
> overestimating the size of the struct / underreporting the count to the
> compiler. So while I think this answer is more correct it doesn't
> actually solve the issue.
>
> Example:
> Let's say the kernel allocates one of these posix_acl_char structs for a
> single char in the array:
>
> malloc(sizeof(posix_acl_char) + 1 * sizeof(char)) = 33
>
> The C standard actually says that this object will behave like this when
> the FAM is accessed:
>
> struct posix_acl {
>     refcount_t a_refcount;
>     struct rcu_head a_rcu;
>     unsigned int a_count;
>     char a_entries[5];
> };
>
> a_count should be set to 5, not 1!
>
While the standard says that it should act as 5 instead of 1 and that
it's not an error to access padding, the point of the __counted_by
attribute is to check that the user isn't doing anything "bad" by
going outside of whatever bounds have been put in place. So I wouldn't
want __bdos(p->a_entries, 0) to return 5 when the initial allocation
is for 1. It's confusing given the documentation for the attribute.
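
For reference, the three candidate sizes quoted above for a_count = 1 can
be reproduced with plain arithmetic. This is only a sketch: the constants
32, 28 and 8 are inferred from the figures in the thread (they would be
sizeof(struct posix_acl), offsetof(struct posix_acl, a_entries) and
sizeof(struct posix_acl_entry) on a 64-bit build), and the function names
are made up for illustration.

```c
#include <stddef.h>

/* Values inferred from the thread for a 64-bit build (assumptions). */
#define ACL_SIZEOF	32u	/* sizeof(struct posix_acl) */
#define ACL_FAM_OFF	28u	/* offsetof(struct posix_acl, a_entries) */
#define ACL_ENTRY_SZ	8u	/* sizeof(struct posix_acl_entry) */

/* What clang currently computes: max(sizeof, offsetof + count * elem). */
static size_t size_clang(size_t count)
{
	size_t flex = ACL_FAM_OFF + count * ACL_ENTRY_SZ;

	return flex > ACL_SIZEOF ? flex : ACL_SIZEOF;
}

/* What the kernel's struct_size() expects: sizeof + count * elem. */
static size_t size_kernel(size_t count)
{
	return ACL_SIZEOF + count * ACL_ENTRY_SZ;
}

/* Largest object that is still consistent with `count` elements. */
static size_t size_c11_max(size_t count)
{
	return ACL_FAM_OFF + (count + 1) * ACL_ENTRY_SZ - 1;
}

/* Number of whole FAM elements that a given allocation leaves room for. */
static size_t fam_capacity(size_t alloc, size_t elem)
{
	return alloc > ACL_FAM_OFF ? (alloc - ACL_FAM_OFF) / elem : 0;
}
```

With count = 1 these give 36, 40 and 43. For the char variant,
fam_capacity(33, 1) is 5, which is the "5 instead of 1" case discussed
above.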

> So we would really need an option to tell the compiler to use the same
> size calculation as the kernel expects here, or maybe be able to specify
> an offset in the __counted_by attribute. Alternatively clang could use
> an option to disable the use of __counted_by for cases where the whole
> struct is passed. This would make it behave like gcc.
>
I would be in favor of disabling __bdos on a whole struct pointer if
it will match the functionality between the compilers. I don't think
Qing has that on her plate at the moment, but when / if she revisits
that we can discuss exactly how to perform the calculations then.

> Issue B (clang + gcc)
> =======
>
> A less serious issue happens with these cases:
>
> function        |comp.|run.|none|count| gcc  |clang |correct
> ----------------|-----|----|----|-----|------|------|-----
> bos(p->a_ent, 1)|  x  |    |    |     |  996 | 996  | 992
> bos(p->a_ent, 1)|  x  |    |    |  x  |  996 | 996  | 992
> bdos(p->a_ent,1)|  x  |    |    |     |  996 | 996  | 992
> bdos(p->a_ent,1)|     | x  |    |     |  996 | 996  | 992
>
> In this case the size returned by bos/bdos is too large, so this won't
> lead to false positives. Both clang and gcc simply compute the distance
> from the start of the FAM to the end of the whole struct. I believe this
> is wrong. According to the C standard the object
> should behave like the FAM was replaced with the largest array that does
> not make the object any larger. The size of that array is 124 elements.
> So the posix_acl becomes:
>
I reported a similar issue to GCC a while back. The response is that
it's not incorrect, because the size is still valid (padding, etc.).
Their view is that, even when asked for the size of the subobject,
they want to return some value, even if it's larger than the
subobject, but not outside the bounds of the full object. I strongly
disagree that it's okay to do that, but I'm probably in the minority.
Clang's support for 1 as the __bdos's second argument isn't great. I'm
trying to fix it, but got sidetracked by higher priority issues.
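
To make the 996 vs 992 difference concrete, here is a small sketch of the
two calculations for the malloc(1024) case. The offset 28 and element size
8 are inferred from the numbers in the table, and the helper names are
only illustrative.

```c
#include <stddef.h>

/*
 * Bytes from the start of the FAM to the end of the allocation. This is
 * the value gcc and clang report for type 1 in the table above.
 */
static size_t fam_bytes_to_end(size_t alloc, size_t fam_off)
{
	return alloc > fam_off ? alloc - fam_off : 0;
}

/* Size of the largest array of whole elements that fits in that space. */
static size_t fam_bytes_whole_elems(size_t alloc, size_t fam_off, size_t elem)
{
	return fam_bytes_to_end(alloc, fam_off) / elem * elem;
}
```

fam_bytes_to_end(1024, 28) is 996, while fam_bytes_whole_elems(1024, 28, 8)
rounds down to 124 whole elements, i.e. 992.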

So in conclusion, if turning off the calculation for a pointer to the
whole struct works, then I'm okay with it.

-bw

> struct posix_acl {
>     refcount_t a_refcount;
>     struct rcu_head a_rcu;
>     unsigned int a_count;
>     struct posix_acl_entry a_entries[124];
> };
>
> Since this is a type 1 bos/bdos it should return the size of just the
> array, which is 124 * 8 = 992, and not 124.5 * 8 = 996.
>
> [1] https://godbolt.org/z/a5eM3z8PY
> [2] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf
>
> Best Regards
> Jan
>
Bill Wendling Oct. 17, 2024, 12:09 a.m. UTC | #40
On Wed, Oct 16, 2024 at 4:41 PM Bill Wendling <morbo@google.com> wrote:
>
> On Sun, Oct 6, 2024 at 8:56 PM Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> > > I want to separate several easily confused issues. Instead of just
> > > saying __bdos, let's clearly refer to what calculation within bdos is
> > > being used. There are 3 choices currently:
> > > - alloc_size attribute
> > > - counted_by attribute
> > > - fallback to __bos (which is similar to sizeof(), except that FAMs are 0 sized)
> > >
> > > Additionally there are (for all intents and purposes) 2 size
> > > determinations to be made by __bos and __bdos, via argument 2:
> > > - containing object size (type 0) ("maximum size")
> > > - specific object size (type 1) ("minimum size")
> >
> > "maximum" vs "minimum" size would be type 0 vs type 2, but I think you
> > do mean type 0 and type 1 as those are the types currently used by
> > __struct_size and __member_size. Those are both "maximum" sizes.
> >
> > >
> > > For example, consider:
> > >
> > > struct posix_acl *acl = malloc(1024);
> > > acl->a_count = 1;
> > >
> > > what should these return:
> > >
> > >       __bos(acl, 0)
> > >       __bos(acl, 1)
> > >       __bdos(acl, 0)
> > >       __bdos(acl, 1)
> > >       __bos(acl->a_entries, 0)
> > >       __bos(acl->a_entries, 1)
> > >       __bdos(acl->a_entries, 0)
> > >       __bdos(acl->a_entries, 1)
> > >
> >
> Thank you for this detailed write-up! I'm sorry for my late response.
>
[snip]
>
> So in conclusion, if turning off the calculation for a pointer to the
> whole struct works, then I'm okay with it.
>
Here's a potential fix:

  https://github.com/llvm/llvm-project/pull/112636

-bw
Jan Hendrik Farr Oct. 17, 2024, 12:41 a.m. UTC | #41
> I would be in favor of disabling __bdos on a whole struct pointer if
> it will match the functionality between the compilers. I don't think
> Qing has that on her plate at the moment, but when / if she revisits
> that we can discuss exactly how to perform the calculations then.

That's a good approach from my perspective.

To get this done we would:

Now:
1. Disable the __counted_by attribute calculation in clang for whole
struct __bdos cases like in [1] and get this into the next clang point
release (19.1.3)

2. In the kernel, disable __counted_by for clang versions < 19.1.3. Also
backport that into the stable kernels

In the future:
3. Try and figure out what the correct counted_by calculation for whole
structs should be in conjunction with gcc and clang. Either provide an
option in clang and gcc to follow the kernel's expectations or change
struct_size in the kernel to match gcc's and clang's future behavior.


[1] https://github.com/llvm/llvm-project/pull/112636

Best Regards
Jan
Jan Hendrik Farr Oct. 17, 2024, 3:04 a.m. UTC | #42
On 16 17:09:42, Bill Wendling wrote:
> On Wed, Oct 16, 2024 at 4:41 PM Bill Wendling <morbo@google.com> wrote:
> >
> > On Sun, Oct 6, 2024 at 8:56 PM Jan Hendrik Farr <kernel@jfarr.cc> wrote:
> > > > I want to separate several easily confused issues. Instead of just
> > > > saying __bdos, let's clearly refer to what calculation within bdos is
> > > > being used. There are 3 choices currently:
> > > > - alloc_size attribute
> > > > - counted_by attribute
> > > > - fallback to __bos (which is similar to sizeof(), except that FAMs are 0 sized)
> > > >
> > > > Additionally there are (for all intents and purposes) 2 size
> > > > determinations to be made by __bos and __bdos, via argument 2:
> > > > - containing object size (type 0) ("maximum size")
> > > > - specific object size (type 1) ("minimum size")
> > >
> > > "maximum" vs "minimum" size would be type 0 vs type 2, but I think you
> > > do mean type 0 and type 1 as those are the types currently used by
> > > __struct_size and __member_size. Those are both "maximum" sizes.
> > >
> > > >
> > > > For example, consider:
> > > >
> > > > struct posix_acl *acl = malloc(1024);
> > > > acl->a_count = 1;
> > > >
> > > > what should these return:
> > > >
> > > >       __bos(acl, 0)
> > > >       __bos(acl, 1)
> > > >       __bdos(acl, 0)
> > > >       __bdos(acl, 1)
> > > >       __bos(acl->a_entries, 0)
> > > >       __bos(acl->a_entries, 1)
> > > >       __bdos(acl->a_entries, 0)
> > > >       __bdos(acl->a_entries, 1)
> > > >
> > >
> > Thank you for this detailed write-up! I'm sorry for my late response.
> >
> [snip]
> >
> > So in conclusion, if turning off the calculation for a pointer to the
> > whole struct works, then I'm okay with it.
> >
> Here's a potential fix:
> 
>   https://github.com/llvm/llvm-project/pull/112636

Here's the patch to disable __counted_by for clang < 19.1.3. I'll submit
it properly when your PR is merged. I hope I got all the correct tags in
there as there were multiple reports of these issues. Let me know if
anything should be added, I'm new to the process.

From: Jan Hendrik Farr <kernel@jfarr.cc>
Date: Thu, 17 Oct 2024 04:39:40 +0200
Subject: [PATCH] Compiler Attributes: disable __counted_by for clang < 19.1.3

This patch disables __counted_by for clang versions < 19.1.3 because of
two issues.

1. clang versions < 19.1.2 have a bug that can lead to __bdos returning 0:
https://github.com/llvm/llvm-project/pull/110497

2. clang versions < 19.1.3 have a bug that can lead to __bdos being off by 4:
https://github.com/llvm/llvm-project/pull/112636

Cc: stable@vger.kernel.org
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202409260949.a1254989-oliver.sang@intel.com
Link: https://lore.kernel.org/all/Zw8iawAF5W2uzGuh@archlinux/T/#m204c09f63c076586a02d194b87dffc7e81b8de7b
Signed-off-by: Jan Hendrik Farr <kernel@jfarr.cc>
---
 include/linux/compiler_attributes.h | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h
index 32284cd26d52..7966a533aaec 100644
--- a/include/linux/compiler_attributes.h
+++ b/include/linux/compiler_attributes.h
@@ -100,8 +100,17 @@
  *
  *   gcc: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108896
  * clang: https://github.com/llvm/llvm-project/pull/76348
+ *
+ * clang versions < 19.1.2 have a bug that can lead to __bdos returning 0:
+ * https://github.com/llvm/llvm-project/pull/110497
+ *
+ * clang versions < 19.1.3 have a bug that can lead to __bdos being off by 4:
+ * https://github.com/llvm/llvm-project/pull/112636
  */
-#if __has_attribute(__counted_by__)
+#if __has_attribute(__counted_by__) && \
+	(!defined(__clang__) || (__clang_major__ > 19) || \
+	(__clang_major__ == 19 && (__clang_minor__ > 1 || \
+	(__clang_minor__ == 1 && __clang_patchlevel__ >= 3))))
 # define __counted_by(member)		__attribute__((__counted_by__(member)))
 #else
 # define __counted_by(member)
Nathan Chancellor Oct. 17, 2024, 4:55 p.m. UTC | #43
Hi Jan,

On Thu, Oct 17, 2024 at 05:04:26AM +0200, Jan Hendrik Farr wrote:
> On 16 17:09:42, Bill Wendling wrote:
> > Here's a potential fix:
> > 
> >   https://github.com/llvm/llvm-project/pull/112636
> 
> Here's the patch to disable __counted_by for clang < 19.1.3. I'll submit
> it properly when your PR is merged. I hope I got all the correct tags in
> there as there were multiple reports of these issues. Let me know if
> anything should be added, I'm new to the process.
> 
> From: Jan Hendrik Farr <kernel@jfarr.cc>
> Date: Thu, 17 Oct 2024 04:39:40 +0200
> Subject: [PATCH] Compiler Attributes: disable __counted_by for clang < 19.1.3
> 
> This patch disables __counted_by for clang versions < 19.1.3 because of
> two issues.
> 
> 1. clang versions < 19.1.2 have a bug that can lead to __bdos returning 0:
> https://github.com/llvm/llvm-project/pull/110497
> 
> 2. clang versions < 19.1.3 have a bug that can lead to __bdos being off by 4:
> https://github.com/llvm/llvm-project/pull/112636
> 
> Cc: stable@vger.kernel.org

Should this include a Fixes tag to give the stable folks a hint about
how far back this should go? Maybe

Fixes: c8248faf3ca2 ("Compiler Attributes: counted_by: Adjust name and identifier expansion")

It won't pick clean without 16c31dd7fdf6 or 2993eb7a8d34 but those are
easy enough to apply before taking this one.

> Reported-by: Nathan Chancellor <nathan@kernel.org>
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202409260949.a1254989-oliver.sang@intel.com
> Link: https://lore.kernel.org/all/Zw8iawAF5W2uzGuh@archlinux/T/#m204c09f63c076586a02d194b87dffc7e81b8de7b
> Signed-off-by: Jan Hendrik Farr <kernel@jfarr.cc>

Thanks for all of your help driving getting this fixed. The commit
message looks good to me aside from my small nit above. I do have a
suggestion on the actual patch itself.

> ---
>  include/linux/compiler_attributes.h | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h
> index 32284cd26d52..7966a533aaec 100644
> --- a/include/linux/compiler_attributes.h
> +++ b/include/linux/compiler_attributes.h
> @@ -100,8 +100,17 @@
>   *
>   *   gcc: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108896
>   * clang: https://github.com/llvm/llvm-project/pull/76348
> + *
> + * clang versions < 19.1.2 have a bug that can lead to __bdos returning 0:
> + * https://github.com/llvm/llvm-project/pull/110497
> + *
> + * clang versions < 19.1.3 have a bug that can lead to __bdos being off by 4:
> + * https://github.com/llvm/llvm-project/pull/112636
>   */
> -#if __has_attribute(__counted_by__)
> +#if __has_attribute(__counted_by__) && \
> +	(!defined(__clang__) || (__clang_major__ > 19) || \
> +	(__clang_major__ == 19 && (__clang_minor__ > 1 || \
> +	(__clang_minor__ == 1 && __clang_patchlevel__ >= 3))))
>  # define __counted_by(member)		__attribute__((__counted_by__(member)))
>  #else
>  # define __counted_by(member)
> -- 
> 2.47.0
> 

compiler_attributes.h is intended to be free from compiler and version
checks, so adding a version check means that __counted_by() needs to be
moved into compiler_types.h. This might be a good opportunity to
introduce something like CC_HAS_COUNTED_BY in Kconfig, so that we can
keep the checks unified (since there are already multiple places that
want to know about __counted_by support for the sake of testing) and
adjust versions like this easily in the future if something else comes
up, especially since __counted_by() is not available in a released GCC
version yet.

Perhaps something like this? Feel free to take it wholesale if you would
like or tweak it however you see fit.

Cheers,
Nathan

diff --git a/drivers/misc/lkdtm/bugs.c b/drivers/misc/lkdtm/bugs.c
index 62ba01525479..376047beea3d 100644
--- a/drivers/misc/lkdtm/bugs.c
+++ b/drivers/misc/lkdtm/bugs.c
@@ -445,7 +445,7 @@ static void lkdtm_FAM_BOUNDS(void)
 
 	pr_err("FAIL: survived access of invalid flexible array member index!\n");
 
-	if (!__has_attribute(__counted_by__))
+	if (!IS_ENABLED(CONFIG_CC_HAS_COUNTED_BY))
 		pr_warn("This is expected since this %s was built with a compiler that does not support __counted_by\n",
 			lkdtm_kernel_info);
 	else if (IS_ENABLED(CONFIG_UBSAN_BOUNDS))
diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h
index 32284cd26d52..c16d4199bf92 100644
--- a/include/linux/compiler_attributes.h
+++ b/include/linux/compiler_attributes.h
@@ -94,19 +94,6 @@
 # define __copy(symbol)
 #endif
 
-/*
- * Optional: only supported since gcc >= 15
- * Optional: only supported since clang >= 18
- *
- *   gcc: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108896
- * clang: https://github.com/llvm/llvm-project/pull/76348
- */
-#if __has_attribute(__counted_by__)
-# define __counted_by(member)		__attribute__((__counted_by__(member)))
-#else
-# define __counted_by(member)
-#endif
-
 /*
  * Optional: not supported by gcc
  * Optional: only supported since clang >= 14.0
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 1a957ea2f4fe..639be0f30b45 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -323,6 +323,25 @@ struct ftrace_likely_data {
 #define __no_sanitize_or_inline __always_inline
 #endif
 
+/*
+ * Optional: only supported since gcc >= 15
+ * Optional: only supported since clang >= 18
+ *
+ *   gcc: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108896
+ * clang: https://github.com/llvm/llvm-project/pull/76348
+ *
+ * __bdos on clang < 19.1.2 can erroneously return 0:
+ * https://github.com/llvm/llvm-project/pull/110497
+ *
+ * __bdos on clang < 19.1.3 can be off by 4:
+ * https://github.com/llvm/llvm-project/pull/112636
+ */
+#ifdef CONFIG_CC_HAS_COUNTED_BY
+# define __counted_by(member)		__attribute__((__counted_by__(member)))
+#else
+# define __counted_by(member)
+#endif
+
 /*
  * Apply __counted_by() when the Endianness matches to increase test coverage.
  */
diff --git a/init/Kconfig b/init/Kconfig
index 1aa95a5dfff8..6da1a8c3d99d 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -120,6 +120,13 @@ config CC_HAS_ASM_INLINE
 config CC_HAS_NO_PROFILE_FN_ATTR
 	def_bool $(success,echo '__attribute__((no_profile_instrument_function)) int x();' | $(CC) -x c - -c -o /dev/null -Werror)
 
+config CC_HAS_COUNTED_BY
+	def_bool $(success,echo 'struct flex { int count; int array[] __attribute__((__counted_by__(count))); };' | $(CC) $(CLANG_FLAGS) -x c - -c -o /dev/null -Werror)
+	# clang needs to be at least 19.1.3 to avoid __bdos miscalculations
+	# https://github.com/llvm/llvm-project/pull/110497
+	# https://github.com/llvm/llvm-project/pull/112636
+	depends on CC_IS_GCC || CLANG_VERSION >= 190103
+
 config PAHOLE_VERSION
 	int
 	default $(shell,$(srctree)/scripts/pahole-version.sh $(PAHOLE))
diff --git a/lib/overflow_kunit.c b/lib/overflow_kunit.c
index 2abc78367dd1..5222c6393f11 100644
--- a/lib/overflow_kunit.c
+++ b/lib/overflow_kunit.c
@@ -1187,7 +1187,7 @@ static void DEFINE_FLEX_test(struct kunit *test)
 {
 	/* Using _RAW_ on a __counted_by struct will initialize "counter" to zero */
 	DEFINE_RAW_FLEX(struct foo, two_but_zero, array, 2);
-#if __has_attribute(__counted_by__)
+#ifdef CONFIG_CC_HAS_COUNTED_BY
 	int expected_raw_size = sizeof(struct foo);
 #else
 	int expected_raw_size = sizeof(struct foo) + 2 * sizeof(s16);
Miguel Ojeda Oct. 17, 2024, 5:39 p.m. UTC | #44
On Thu, Oct 17, 2024 at 6:55 PM Nathan Chancellor <nathan@kernel.org> wrote:
>
> Should this include a Fixes tag to give the stable folks a hint about
> how far back this should go? Maybe
>
> Fixes: c8248faf3ca2 ("Compiler Attributes: counted_by: Adjust name and identifier expansion")

Yeah, I am not sure -- it does not really fix that commit, but if it
helps the stable team...

> compiler_attributes.h is intended to be free from compiler and version
> checks, so adding a version check means that __counted_by() needs to be

Yeah, ideally we should avoid that since the goal was to have a file
with the straightforward ones.

Though if we do go for `CC_HAS_*`, I guess it would be simple enough
too, i.e. similar to `has_attribute` (but on our side), but it also
loses the simplicity of knowing those do not have arbitrarily complex
conditions which `CC_HAS_*` could hide.

> moved into compiler_types.h. This might be a good opportunity to
> introduce something like CC_HAS_COUNTED_BY in Kconfig, so that we can
> keep the checks unified (since there are already multiple places that
> want to know about __counted_by support for the sake of testing) and
> adjust versions like this easily in the future if something else comes
> up, especially since __counted_by() is not available in a released GCC
> version yet.

Sounds good to me (even if we did the unification somewhere else).
Using `CLANG_VERSION` looks better too.

> +config CC_HAS_COUNTED_BY
> +       def_bool $(success,echo 'struct flex { int count; int array[] __attribute__((__counted_by__(count))); };' | $(CC) $(CLANG_FLAGS) -x c - -c -o /dev/null -Werror)

I am probably missing some context, but what is the reason for the
build test? i.e. is there a reason we cannot test the GCC version too?
If the reason is that it is not released, should we change it
later?

Thanks! (and for the Cc).

Cheers,
Miguel
Nathan Chancellor Oct. 17, 2024, 6:55 p.m. UTC | #45
Hi Miguel,

On Thu, Oct 17, 2024 at 07:39:43PM +0200, Miguel Ojeda wrote:
> On Thu, Oct 17, 2024 at 6:55 PM Nathan Chancellor <nathan@kernel.org> wrote:
> >
> > Should this include a Fixes tag to give the stable folks a hint about
> > how far back this should go? Maybe
> >
> > Fixes: c8248faf3ca2 ("Compiler Attributes: counted_by: Adjust name and identifier expansion")
> 
> Yeah, I am not sure -- it does not really fix that commit, but if it
> helps the stable team...

The most "correct" Fixes tag would appear to be the one that first
introduced __counted_by itself (dd06e72e68bc) but __counted_by could never
be used as of that original change because the test used __element_count__
as the attribute name, which never shipped in any compiler. So I would
argue that this change really does fix c8248faf3ca2 because that is the
point in time that needs this fix.

> > compiler_attributes.h is intended to be free from compiler and version
> > checks, so adding a version check means that __counted_by() needs to be
> 
> Yeah, ideally we should avoid that since the goal was to have a file
> with the straightforward ones.
> 
> Though if we do go for `CC_HAS_*`, I guess it would be simple enough
> too, i.e. similar to `has_attribute` (but on our side), but it also
> loses the simplicity of knowing those do not have arbitrarily complex
> conditions which `CC_HAS_*` could hide.

Yeah, I think the way compiler_attributes.h has operated with regards to
tests and such is working fine so far, no real need to switch things up
yet.

> > moved into compiler_types.h. This might be a good opportunity to
> > introduce something like CC_HAS_COUNTED_BY in Kconfig, so that we can
> > keep the checks unified (since there are already multiple places that
> > want to know about __counted_by support for the sake of testing) and
> > adjust versions like this easily in the future if something else comes
> > up, especially since __counted_by() is not available in a released GCC
> > version yet.
> 
> Sounds good to me (even if we did the unification somewhere else).
> Using `CLANG_VERSION` looks better too.
> 
> > +config CC_HAS_COUNTED_BY
> > +       def_bool $(success,echo 'struct flex { int count; int array[] __attribute__((__counted_by__(count))); };' | $(CC) $(CLANG_FLAGS) -x c - -c -o /dev/null -Werror)
> 
> I am probably missing some context, but what is the reason for the
> build test? i.e. is there a reason we cannot test the GCC version too?

The only reason I generally do build tests plus compiler version checks
is due to certain vendor versions of clang, such as Android, which may
hijack the patch version and appear newer than they actually are.
However, now that I think about it, LLVM moving to GCC's minor
versioning scheme of 0 for the development / main branch and 1+ for
released versions should avoid that issue, so it isn't strictly
necessary for that reason. However...

> If the reason is that it is not released, should we change it
> later?

This is a good point, as technically to allow use of __counted_by with
GCC with a version check, it would need to be 150000, which would
potentially break GCC versions between the 15 version bump and landing
__counted_by support without the feature check. We could also just do
150100 to be simple about it but I am not sure that is worth doing,
since I believe it is important that we support using __counted_by with
prerelease GCC. We want to make sure that this attribute gets decent
testing coverage while in development.

We could ship this with a comment to simplify the check when GCC 15.1.0
is released, since this is a feature very unlikely to be backported to
earlier GCC releases?
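
For context, the numbers above follow the encoding the kernel uses for
CLANG_VERSION and GCC_VERSION: major * 10000 + minor * 100 + patchlevel.
A tiny sketch of that encoding (the helper name is made up):

```c
/* Encode a compiler version the way Kconfig's *_VERSION symbols do. */
static int cc_version(int major, int minor, int patch)
{
	return major * 10000 + minor * 100 + patch;
}
```

So clang 19.1.3 is 190103, a GCC 15 development snapshot (15.0.0) is
150000, and the first GCC 15 release (15.1.0) is 150100.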

Cheers,
Nathan


I guess since we have a hard version check for clang, we might as well
have one for GCC as well.
Miguel Ojeda Oct. 18, 2024, 11:52 a.m. UTC | #46
On Thu, Oct 17, 2024 at 8:55 PM Nathan Chancellor <nathan@kernel.org> wrote:
>
> The most "correct" Fixes tag would appear to be the one that first
> introduced __counted_by itself (dd06e72e68bc) but __counted_by could never
> be used as of that original change because the test used __element_count__
> as the attribute name, which never shipped in any compiler. So I would
> argue that this change really does fix c8248faf3ca2 because that is the
> point in time that needs this fix.

That is fair, you are right.

> This is a good point, as technically to allow use of __counted_by with
> GCC with a version check, it would need to be 150000, which would
> potentially break GCC versions between the 15 version bump and landing
> __counted_by support without the feature check. We could also just do
> 150100 to be simple about it but I am not sure that is worth doing,
> since I believe it is important that we support using __counted_by with
> prerelease GCC. We want to make sure that this attribute gets decent
> testing coverage while in development.
>
> We could ship this with a comment to simplify the check when GCC 15.1.0
> is released, since this is a feature very unlikely to be backported to
> earlier GCC releases?

Thanks for the clear explanation! Yeah, if older not-yet-released GCC
15s are important for some people, then I think it is fair to have the
build test for the time being.

Cheers,
Miguel
Jan Hendrik Farr Oct. 21, 2024, 1:33 a.m. UTC | #47
On 17 09:55:22, Nathan Chancellor wrote:

Hi Nathan,

Thanks for the feedback.

> Hi Jan,
> 
> On Thu, Oct 17, 2024 at 05:04:26AM +0200, Jan Hendrik Farr wrote:
> > On 16 17:09:42, Bill Wendling wrote:
> > > Here's a potential fix:
> > > 
> > >   https://github.com/llvm/llvm-project/pull/112636
> > 
> > Here's the patch to disable __counted_by for clang < 19.1.3. I'll submit
> > it properly when your PR is merged. I hope I got all the correct tags in
> > there as there were multiple reports of these issues. Let me know if
> > anything should be added, I'm new to the process.
> > 
> > From: Jan Hendrik Farr <kernel@jfarr.cc>
> > Date: Thu, 17 Oct 2024 04:39:40 +0200
> > Subject: [PATCH] Compiler Attributes: disable __counted_by for clang < 19.1.3
> > 
> > This patch disables __counted_by for clang versions < 19.1.3 because of
> > two issues.
> > 
> > 1. clang versions < 19.1.2 have a bug that can lead to __bdos returning 0:
> > https://github.com/llvm/llvm-project/pull/110497
> > 
> > 2. clang versions < 19.1.3 have a bug that can lead to __bdos being off by 4:
> > https://github.com/llvm/llvm-project/pull/112636
> > 
> > Cc: stable@vger.kernel.org
> 
> Should this include a Fixes tag to give the stable folks a hint about
> how far back this should go? Maybe
> 
> Fixes: c8248faf3ca2 ("Compiler Attributes: counted_by: Adjust name and identifier expansion")
> 
> It won't pick clean without 16c31dd7fdf6 or 2993eb7a8d34 but those are
> easy enough to apply before taking this one.

Yes, I'll add this. I agree that c8248faf3ca2 is the correct commit for
the Fixes tag, as this fix is not needed before this commit.

> 
> > Reported-by: Nathan Chancellor <nathan@kernel.org>
> > Reported-by: kernel test robot <oliver.sang@intel.com>
> > Closes: https://lore.kernel.org/oe-lkp/202409260949.a1254989-oliver.sang@intel.com
> > Link: https://lore.kernel.org/all/Zw8iawAF5W2uzGuh@archlinux/T/#m204c09f63c076586a02d194b87dffc7e81b8de7b
> > Signed-off-by: Jan Hendrik Farr <kernel@jfarr.cc>
> 
> Thanks for all of your help driving getting this fixed. The commit
> > message looks good to me aside from my small nit above. I do have a
> suggestion on the actual patch itself.
> 
> > ---
> >  include/linux/compiler_attributes.h | 11 ++++++++++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h
> > index 32284cd26d52..7966a533aaec 100644
> > --- a/include/linux/compiler_attributes.h
> > +++ b/include/linux/compiler_attributes.h
> > @@ -100,8 +100,17 @@
> >   *
> >   *   gcc: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108896
> >   * clang: https://github.com/llvm/llvm-project/pull/76348
> > + *
> > + * clang versions < 19.1.2 have a bug that can lead to __bdos returning 0:
> > + * https://github.com/llvm/llvm-project/pull/110497
> > + *
> > + * clang versions < 19.1.3 have a bug that can lead to __bdos being off by 4:
> > + * https://github.com/llvm/llvm-project/pull/112636
> >   */
> > -#if __has_attribute(__counted_by__)
> > +#if __has_attribute(__counted_by__) && \
> > +	(!defined(__clang__) || (__clang_major__ > 19) || \
> > +	(__clang_major__ == 19 && (__clang_minor__ > 1 || \
> > +	(__clang_minor__ == 1 && __clang_patchlevel__ >= 3))))
> >  # define __counted_by(member)		__attribute__((__counted_by__(member)))
> >  #else
> >  # define __counted_by(member)
> > -- 
> > 2.47.0
> > 
> 
> compiler_attributes.h is intended to be free from compiler and version
> checks, so adding a version check means that __counted_by() needs to be
> moved into compiler_types.h. This might be a good opportunity to
> introduce something like CC_HAS_COUNTED_BY in Kconfig, so that we can
> keep the checks unified (since there are already multiple places that
> want to know about __counted_by support for the sake of testing) and
> adjust versions like this easily in the future if something else comes
> up, especially since __counted_by() is not available in a released GCC
> version yet.
> 
> Perhaps something like this? Feel free to take it wholesale if you would
> like or tweak it however you see fit.

Thanks, I only have one tweak below:

> 
> Cheers,
> Nathan
> 
> diff --git a/drivers/misc/lkdtm/bugs.c b/drivers/misc/lkdtm/bugs.c
> index 62ba01525479..376047beea3d 100644
> --- a/drivers/misc/lkdtm/bugs.c
> +++ b/drivers/misc/lkdtm/bugs.c
> @@ -445,7 +445,7 @@ static void lkdtm_FAM_BOUNDS(void)
>  
>  	pr_err("FAIL: survived access of invalid flexible array member index!\n");
>  
> -	if (!__has_attribute(__counted_by__))
> +	if (!IS_ENABLED(CONFIG_CC_HAS_COUNTED_BY))
>  		pr_warn("This is expected since this %s was built with a compiler that does not support __counted_by\n",
>  			lkdtm_kernel_info);
>  	else if (IS_ENABLED(CONFIG_UBSAN_BOUNDS))
> diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h
> index 32284cd26d52..c16d4199bf92 100644
> --- a/include/linux/compiler_attributes.h
> +++ b/include/linux/compiler_attributes.h
> @@ -94,19 +94,6 @@
>  # define __copy(symbol)
>  #endif
>  
> -/*
> - * Optional: only supported since gcc >= 15
> - * Optional: only supported since clang >= 18
> - *
> - *   gcc: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108896
> - * clang: https://github.com/llvm/llvm-project/pull/76348
> - */
> -#if __has_attribute(__counted_by__)
> -# define __counted_by(member)		__attribute__((__counted_by__(member)))
> -#else
> -# define __counted_by(member)
> -#endif
> -
>  /*
>   * Optional: not supported by gcc
>   * Optional: only supported since clang >= 14.0
> diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> index 1a957ea2f4fe..639be0f30b45 100644
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -323,6 +323,25 @@ struct ftrace_likely_data {
>  #define __no_sanitize_or_inline __always_inline
>  #endif
>  
> +/*
> + * Optional: only supported since gcc >= 15
> + * Optional: only supported since clang >= 18
> + *
> + *   gcc: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108896
> + * clang: https://github.com/llvm/llvm-project/pull/76348
> + *
> + * __bdos on clang < 19.1.2 can erroneously return 0:
> + * https://github.com/llvm/llvm-project/pull/110497
> + *
> + * __bdos on clang < 19.1.3 can be off by 4:
> + * https://github.com/llvm/llvm-project/pull/112636
> + */
> +#ifdef CONFIG_CC_HAS_COUNTED_BY
> +# define __counted_by(member)		__attribute__((__counted_by__(member)))
> +#else
> +# define __counted_by(member)
> +#endif
> +
>  /*
>   * Apply __counted_by() when the Endianness matches to increase test coverage.
>   */
> diff --git a/init/Kconfig b/init/Kconfig
> index 1aa95a5dfff8..6da1a8c3d99d 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -120,6 +120,13 @@ config CC_HAS_ASM_INLINE
>  config CC_HAS_NO_PROFILE_FN_ATTR
>  	def_bool $(success,echo '__attribute__((no_profile_instrument_function)) int x();' | $(CC) -x c - -c -o /dev/null -Werror)
>  
> +config CC_HAS_COUNTED_BY
> +	def_bool $(success,echo 'struct flex { int count; int array[] __attribute__((__counted_by__(count))); };' | $(CC) $(CLANG_FLAGS) -x c - -c -o /dev/null -Werror)
> +	# clang needs to be at least 19.1.3 to avoid __bdos miscalculations
> +	# https://github.com/llvm/llvm-project/pull/110497
> +	# https://github.com/llvm/llvm-project/pull/112636
> +	depends on CC_IS_GCC || CLANG_VERSION >= 190103

I think I prefer

	depends on !(CC_IS_CLANG && CLANG_VERSION < 190103)

to make it more clear that the purpose is to disable this for clang
versions below 19.1.3, but keep it enabled for every other compiler
including pre-release gcc versions that pass the compile test.

Also after gcc 15 is released I don't think a version check for gcc
should be necessary. I only see an explicit version check as required
when we know a certain version is broken. Otherwise I would prefer using
the build test.


I guess an alternative would be to just create a
CC_COUNTED_BY_BROKEN that is enabled for clang versions below 19.1.3
and continue to use __has_attribute together with that option. That
would make the build test unnecessary. The downside is that it
will require checking both __has_attribute and
CONFIG_CC_COUNTED_BY_BROKEN for __counted_by support. So I think
CC_HAS_COUNTED_BY is better.
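
To illustrate the downside, here is a rough sketch of what the double
check might look like in C. CONFIG_CC_COUNTED_BY_BROKEN is the
hypothetical symbol from the paragraph above (it does not exist in the
kernel), and flex_counted_by and COUNTED_BY_ACTIVE are made-up stand-ins
so the snippet builds outside the kernel tree:

```c
/* Fallback for compilers that lack __has_attribute entirely. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/*
 * Hypothetical: CONFIG_CC_COUNTED_BY_BROKEN would be set by Kconfig for
 * clang < 19.1.3. Every user has to test both conditions.
 */
#if __has_attribute(__counted_by__) && !defined(CONFIG_CC_COUNTED_BY_BROKEN)
# define flex_counted_by(member)	__attribute__((__counted_by__(member)))
# define COUNTED_BY_ACTIVE	1
#else
# define flex_counted_by(member)
# define COUNTED_BY_ACTIVE	0
#endif

/* Same shape as the struct used in the Kconfig build test. */
struct flex {
	int count;
	int array[] flex_counted_by(count);
};
```

Every place that currently tests __has_attribute(__counted_by__) would
need the same two-part condition, whereas with CC_HAS_COUNTED_BY it is a
single #ifdef.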

> +
>  config PAHOLE_VERSION
>  	int
>  	default $(shell,$(srctree)/scripts/pahole-version.sh $(PAHOLE))
> diff --git a/lib/overflow_kunit.c b/lib/overflow_kunit.c
> index 2abc78367dd1..5222c6393f11 100644
> --- a/lib/overflow_kunit.c
> +++ b/lib/overflow_kunit.c
> @@ -1187,7 +1187,7 @@ static void DEFINE_FLEX_test(struct kunit *test)
>  {
>  	/* Using _RAW_ on a __counted_by struct will initialize "counter" to zero */
>  	DEFINE_RAW_FLEX(struct foo, two_but_zero, array, 2);
> -#if __has_attribute(__counted_by__)
> +#ifdef CONFIG_CC_HAS_COUNTED_BY
>  	int expected_raw_size = sizeof(struct foo);
>  #else
>  	int expected_raw_size = sizeof(struct foo) + 2 * sizeof(s16);

I'll submit it once Bill's fix is in the release/19.x branch. Which
maintainer should I address this to? You (Nathan), Miguel, Kees, or
someone else?


Best Regards
Jan
Miguel Ojeda Oct. 21, 2024, 6:04 a.m. UTC | #48
On Mon, Oct 21, 2024 at 3:33 AM Jan Hendrik Farr <kernel@jfarr.cc> wrote:
>
> I think I prefer
>
>         depends on !(CC_IS_CLANG && CLANG_VERSION < 190103)
>
> to make it more clear that the purpose is to disable this for clang
> versions below 19.1.3, but keep it enabled for every other compiler
> including pre-release gcc versions that pass the compile test.

Do we want other tooling to see the attribute? i.e. if the build check
gets removed, then that `depends on` would mean other tooling would
see it, right?

> Also after gcc 15 is released I don't think a version check for gcc
> should be necessary. I only see an explicit version check as required
> when we know a certain version is broken. Otherwise I would prefer using
> the build test.

Yeah, build tests are nice, although they require spawning a process
and so on, which (as far as I understand) we try to minimize. Version
checks also have the advantage that it is easy to remember/check when
we can remove the checks themselves when we upgrade the minimum
versions.

> I guess an alternative would be to just create a
> CC_COUNTED_BY_BROKEN that is enabled for clang versions below 19.1.3
> and continue to use __has_attribute together with that option. That
> would make the build test unnecessary. The downside is that it
> will require checking both __has_attribute and
> CONFIG_CC_COUNTED_BY_BROKEN for __counted_by support. So I think
> CC_HAS_COUNTED_BY is better.

Yeah, if we are going to need a new Kconfig symbol anyway, then let's
make that the only thing to check. Otherwise we are in the "worst of
both worlds", I would say.

> I'll submit it once Bill's fix is in the release/19.x branch. Which
> maintainer should I address this to? You (Nathan), Miguel, Kees, or
> someone else?

Sounds good -- if you want, you can send it to all of us and we can
figure that out later.

Thanks!

Cheers,
Miguel
Jan Hendrik Farr Oct. 21, 2024, 5:01 p.m. UTC | #49
On 21 08:04:03, Miguel Ojeda wrote:
> > Also after gcc 15 is released I don't think a version check for gcc
> > should be necessary. I only see an explicit version check as required
> > when we know a certain version is broken. Otherwise I would prefer using
> > the build test.
> 
> Yeah, build tests are nice, although they require spawning a process
> and so on, which (as far as I understand) we try to minimize. Version
> checks also have the advantage that it is easy to remember/check when
> we can remove the checks themselves when we upgrade the minimum
> versions.
> 

If the goal is to minimize the need for build tests, I think we should
go with Nathan's suggestion of keeping the build test for now (to
support pre-release gcc versions) and remove it and just go with
version checks for both gcc and clang once gcc 15 is released.

Best Regards
Jan
Nathan Chancellor Oct. 21, 2024, 7:25 p.m. UTC | #50
On Mon, Oct 21, 2024 at 03:33:36AM +0200, Jan Hendrik Farr wrote:
> > +config CC_HAS_COUNTED_BY
> > +	def_bool $(success,echo 'struct flex { int count; int array[] __attribute__((__counted_by__(count))); };' | $(CC) $(CLANG_FLAGS) -x c - -c -o /dev/null -Werror)
> > +	# clang needs to be at least 19.1.3 to avoid __bdos miscalculations
> > +	# https://github.com/llvm/llvm-project/pull/110497
> > +	# https://github.com/llvm/llvm-project/pull/112636
> > +	depends on CC_IS_GCC || CLANG_VERSION >= 190103
> 
> I think I prefer
> 
> 	depends on !(CC_IS_CLANG && CLANG_VERSION < 190103)
> 
> to make it more clear that the purpose is to disable this for clang
> versions below 19.1.3, but keep it enabled for every other compiler
> including pre-release gcc versions that pass the compile test.

Sure, that's a reasonable tweak to keep it a little bit more concise and
to the point. It's obviously logically equivalent.
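
As a quick sanity check of that equivalence, the two conditions can be
modeled as plain C predicates. This is only a sketch: the function names
are made up, and it assumes CLANG_VERSION evaluates to 0 when the
compiler is not clang.

```c
#include <stdbool.h>

/* Original form: depends on CC_IS_GCC || CLANG_VERSION >= 190103 */
static bool dep_original(bool cc_is_gcc, int clang_version)
{
	return cc_is_gcc || clang_version >= 190103;
}

/* Proposed form: depends on !(CC_IS_CLANG && CLANG_VERSION < 190103) */
static bool dep_proposed(bool cc_is_clang, int clang_version)
{
	return !(cc_is_clang && clang_version < 190103);
}

/*
 * Assumption: the compiler is either gcc or clang, so
 * CC_IS_GCC == !CC_IS_CLANG. Under that assumption both forms agree.
 */
static bool forms_agree(bool cc_is_clang, int clang_version)
{
	return dep_original(!cc_is_clang, clang_version) ==
	       dep_proposed(cc_is_clang, clang_version);
}
```

The two forms would only differ for a compiler that is neither gcc nor
clang: the proposed form stays enabled there and leaves the decision to
the build test.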

> Also after gcc 15 is released I don't think a version check for gcc
> should be necessary. I only see an explicit version check as required
> when we know a certain version is broken. Otherwise I would prefer using
> the build test.

Yeah, I think this mostly got addressed by the comments downthread; I
think we are all in agreement.

> I guess an alternative would be to just create a
> CC_COUNTED_BY_BROKEN that is enabled for clang versions below 19.1.3
> and continue to use __has_attribute together with that option. That
> would make the build test unnecessary. The downside is that it
> will require checking both __has_attribute and
> CONFIG_CC_COUNTED_BY_BROKEN for __counted_by support. So I think
> CC_HAS_COUNTED_BY is better.

Yeah I thought about something like that briefly but came to the same
conclusion quickly, especially once I realized how many places were
using __has_attribute for __counted_by already.

> I'll submit it once Bill's fix is in the release/19.x branch. Which
> maintainer should I address this to? You (Nathan), Miguel, Kees, or
> someone else?

Like Miguel said, you can send it to all the people you have mentioned
here but I would probably expect Kees to chauffeur this to Linus with
Miguel's Ack for compiler_attributes.h since Kees has generally owned
__counted_by up until this point.

Cheers,
Nathan
Jan Hendrik Farr Oct. 24, 2024, 1:16 p.m. UTC | #51
Hi Nathan,

Do you want me to add a Co-Developed-by tag for you? I feel bad just
taking it.

For reference here is the current state of the patch, still waiting on
the merge into clang 19.1.x:

It needs three prerequisite commits on top of 6.6.x. Unfortunately, it
still requires a small amount of manual conflict resolution, but that is
easy enough:

1. include/linux/compiler_types.h:
	use the incoming change up to (but not including) the
	"Apply __counted_by() when the Endianness matches to increase test coverage."
	comment

2. lib/overflow_kunit.c: 
	HEAD is correct


From 6c667a43af0c57cd3f260fd75d5c4a198ba94220 Mon Sep 17 00:00:00 2001
From: Jan Hendrik Farr <kernel@jfarr.cc>
Date: Thu, 17 Oct 2024 04:39:40 +0200
Subject: [PATCH] Compiler Attributes: disable __counted_by for clang < 19.1.3

This patch disables __counted_by for clang versions < 19.1.3 because
of the two issues listed below. It does this by introducing
CONFIG_CC_HAS_COUNTED_BY.

1. clang < 19.1.2 has a bug that can lead to __bdos returning 0:
https://github.com/llvm/llvm-project/pull/110497

2. clang < 19.1.3 has a bug that can lead to __bdos being off by 4:
https://github.com/llvm/llvm-project/pull/112636

Fixes: c8248faf3ca2 ("Compiler Attributes: counted_by: Adjust name and identifier expansion")
Cc: stable@vger.kernel.org # 6.6.x: 16c31dd7fdf6: Compiler Attributes: counted_by: bump min gcc version
Cc: stable@vger.kernel.org # 6.6.x: 2993eb7a8d34: Compiler Attributes: counted_by: fixup clang URL
Cc: stable@vger.kernel.org # 6.6.x: 231dc3f0c936: lkdtm/bugs: Improve warning message for compilers without counted_by support
Cc: stable@vger.kernel.org # 6.6.x
Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://lore.kernel.org/all/20240913164630.GA4091534@thelio-3990X/
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202409260949.a1254989-oliver.sang@intel.com
Link: https://lore.kernel.org/all/Zw8iawAF5W2uzGuh@archlinux/T/#m204c09f63c076586a02d194b87dffc7e81b8de7b
Signed-off-by: Jan Hendrik Farr <kernel@jfarr.cc>
---
 drivers/misc/lkdtm/bugs.c           |  2 +-
 include/linux/compiler_attributes.h | 13 -------------
 include/linux/compiler_types.h      | 19 +++++++++++++++++++
 init/Kconfig                        |  8 ++++++++
 lib/overflow_kunit.c                |  2 +-
 5 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/drivers/misc/lkdtm/bugs.c b/drivers/misc/lkdtm/bugs.c
index 62ba01525479..376047beea3d 100644
--- a/drivers/misc/lkdtm/bugs.c
+++ b/drivers/misc/lkdtm/bugs.c
@@ -445,7 +445,7 @@ static void lkdtm_FAM_BOUNDS(void)
 
 	pr_err("FAIL: survived access of invalid flexible array member index!\n");
 
-	if (!__has_attribute(__counted_by__))
+	if (!IS_ENABLED(CONFIG_CC_HAS_COUNTED_BY))
 		pr_warn("This is expected since this %s was built with a compiler that does not support __counted_by\n",
 			lkdtm_kernel_info);
 	else if (IS_ENABLED(CONFIG_UBSAN_BOUNDS))
diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h
index 32284cd26d52..c16d4199bf92 100644
--- a/include/linux/compiler_attributes.h
+++ b/include/linux/compiler_attributes.h
@@ -94,19 +94,6 @@
 # define __copy(symbol)
 #endif
 
-/*
- * Optional: only supported since gcc >= 15
- * Optional: only supported since clang >= 18
- *
- *   gcc: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108896
- * clang: https://github.com/llvm/llvm-project/pull/76348
- */
-#if __has_attribute(__counted_by__)
-# define __counted_by(member)		__attribute__((__counted_by__(member)))
-#else
-# define __counted_by(member)
-#endif
-
 /*
  * Optional: not supported by gcc
  * Optional: only supported since clang >= 14.0
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 1a957ea2f4fe..639be0f30b45 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -323,6 +323,25 @@ struct ftrace_likely_data {
 #define __no_sanitize_or_inline __always_inline
 #endif
 
+/*
+ * Optional: only supported since gcc >= 15
+ * Optional: only supported since clang >= 18
+ *
+ *   gcc: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108896
+ * clang: https://github.com/llvm/llvm-project/pull/76348
+ *
+ * __bdos on clang < 19.1.2 can erroneously return 0:
+ * https://github.com/llvm/llvm-project/pull/110497
+ *
+ * __bdos on clang < 19.1.3 can be off by 4:
+ * https://github.com/llvm/llvm-project/pull/112636
+ */
+#ifdef CONFIG_CC_HAS_COUNTED_BY
+# define __counted_by(member)		__attribute__((__counted_by__(member)))
+#else
+# define __counted_by(member)
+#endif
+
 /*
  * Apply __counted_by() when the Endianness matches to increase test coverage.
  */
diff --git a/init/Kconfig b/init/Kconfig
index 530a382ee0fe..5f1fe3583f20 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -116,6 +116,14 @@ config CC_HAS_ASM_INLINE
 config CC_HAS_NO_PROFILE_FN_ATTR
 	def_bool $(success,echo '__attribute__((no_profile_instrument_function)) int x();' | $(CC) -x c - -c -o /dev/null -Werror)
 
+# clang needs to be at least 19.1.3 to avoid __bdos miscalculations
+# https://github.com/llvm/llvm-project/pull/110497
+# https://github.com/llvm/llvm-project/pull/112636
+# TODO: when gcc 15 is released remove the build test and add gcc version check
+config CC_HAS_COUNTED_BY
+	def_bool $(success,echo 'struct flex { int count; int array[] __attribute__((__counted_by__(count))); };' | $(CC) $(CLANG_FLAGS) -x c - -c -o /dev/null -Werror)
+	depends on !(CC_IS_CLANG && CLANG_VERSION < 190103)
+
 config PAHOLE_VERSION
 	int
 	default $(shell,$(srctree)/scripts/pahole-version.sh $(PAHOLE))
diff --git a/lib/overflow_kunit.c b/lib/overflow_kunit.c
index 2abc78367dd1..5222c6393f11 100644
--- a/lib/overflow_kunit.c
+++ b/lib/overflow_kunit.c
@@ -1187,7 +1187,7 @@ static void DEFINE_FLEX_test(struct kunit *test)
 {
 	/* Using _RAW_ on a __counted_by struct will initialize "counter" to zero */
 	DEFINE_RAW_FLEX(struct foo, two_but_zero, array, 2);
-#if __has_attribute(__counted_by__)
+#ifdef CONFIG_CC_HAS_COUNTED_BY
 	int expected_raw_size = sizeof(struct foo);
 #else
 	int expected_raw_size = sizeof(struct foo) + 2 * sizeof(s16);
Nathan Chancellor Oct. 25, 2024, 1:15 a.m. UTC | #52
Hi Jan,

On Thu, Oct 24, 2024 at 03:16:50PM +0200, Jan Hendrik Farr wrote:
> Do you want me to add a Co-Developed-by tag for you? I feel bad just
> taking it.

I would not be opposed to a Co-developed-by tag since it reflects the
collaborative nature of the change but I do not need it just for the
sake of credit because you have done a good amount of work analyzing
this problem and driving it to resolution. I would argue I just
"Kconfig-ified" your proposed change :) So you have my permission to add
one but I will not be offended with just the Suggested-by!

> For reference here is the current state of the patch, still waiting on
> the merge into clang 19.1.x:
> 
> It needs three prerequisite commits on top of 6.6.x. Unfortunately, it
> still requires a small amount of manual conflict resolution, but that is
> easy enough:
> 
> 1. include/linux/compiler_types.h:
> 	use the incoming change up to (but not including) the
> 	"Apply __counted_by() when the Endianness matches to increase test coverage."
> 	comment
> 
> 2. lib/overflow_kunit.c: 
> 	HEAD is correct

Good to know. If they cannot resolve the conflicts, we'll get notified
that it has failed to apply so you (or one of us) can send a massaged
backport as a reply.

> From 6c667a43af0c57cd3f260fd75d5c4a198ba94220 Mon Sep 17 00:00:00 2001
> From: Jan Hendrik Farr <kernel@jfarr.cc>
> Date: Thu, 17 Oct 2024 04:39:40 +0200
> Subject: [PATCH] Compiler Attributes: disable __counted_by for clang < 19.1.3
> 
> This patch disables __counted_by for clang versions < 19.1.3 because
> of the two issues listed below. It does this by introducing
> CONFIG_CC_HAS_COUNTED_BY.
> 
> 1. clang < 19.1.2 has a bug that can lead to __bdos returning 0:
> https://github.com/llvm/llvm-project/pull/110497
> 
> 2. clang < 19.1.3 has a bug that can lead to __bdos being off by 4:
> https://github.com/llvm/llvm-project/pull/112636
> 
> Fixes: c8248faf3ca2 ("Compiler Attributes: counted_by: Adjust name and identifier expansion")
> Cc: stable@vger.kernel.org # 6.6.x: 16c31dd7fdf6: Compiler Attributes: counted_by: bump min gcc version
> Cc: stable@vger.kernel.org # 6.6.x: 2993eb7a8d34: Compiler Attributes: counted_by: fixup clang URL
> Cc: stable@vger.kernel.org # 6.6.x: 231dc3f0c936: lkdtm/bugs: Improve warning message for compilers without counted_by support
> Cc: stable@vger.kernel.org # 6.6.x
> Reported-by: Nathan Chancellor <nathan@kernel.org>
> Closes: https://lore.kernel.org/all/20240913164630.GA4091534@thelio-3990X/
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202409260949.a1254989-oliver.sang@intel.com
> Link: https://lore.kernel.org/all/Zw8iawAF5W2uzGuh@archlinux/T/#m204c09f63c076586a02d194b87dffc7e81b8de7b
> Signed-off-by: Jan Hendrik Farr <kernel@jfarr.cc>

If you do not add the Co-developed-by, feel free to carry forward

Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>

on the official submission.

> ---
>  drivers/misc/lkdtm/bugs.c           |  2 +-
>  include/linux/compiler_attributes.h | 13 -------------
>  include/linux/compiler_types.h      | 19 +++++++++++++++++++
>  init/Kconfig                        |  8 ++++++++
>  lib/overflow_kunit.c                |  2 +-
>  5 files changed, 29 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/misc/lkdtm/bugs.c b/drivers/misc/lkdtm/bugs.c
> index 62ba01525479..376047beea3d 100644
> --- a/drivers/misc/lkdtm/bugs.c
> +++ b/drivers/misc/lkdtm/bugs.c
> @@ -445,7 +445,7 @@ static void lkdtm_FAM_BOUNDS(void)
>  
>  	pr_err("FAIL: survived access of invalid flexible array member index!\n");
>  
> -	if (!__has_attribute(__counted_by__))
> +	if (!IS_ENABLED(CONFIG_CC_HAS_COUNTED_BY))
>  		pr_warn("This is expected since this %s was built with a compiler that does not support __counted_by\n",
>  			lkdtm_kernel_info);
>  	else if (IS_ENABLED(CONFIG_UBSAN_BOUNDS))
> diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h
> index 32284cd26d52..c16d4199bf92 100644
> --- a/include/linux/compiler_attributes.h
> +++ b/include/linux/compiler_attributes.h
> @@ -94,19 +94,6 @@
>  # define __copy(symbol)
>  #endif
>  
> -/*
> - * Optional: only supported since gcc >= 15
> - * Optional: only supported since clang >= 18
> - *
> - *   gcc: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108896
> - * clang: https://github.com/llvm/llvm-project/pull/76348
> - */
> -#if __has_attribute(__counted_by__)
> -# define __counted_by(member)		__attribute__((__counted_by__(member)))
> -#else
> -# define __counted_by(member)
> -#endif
> -
>  /*
>   * Optional: not supported by gcc
>   * Optional: only supported since clang >= 14.0
> diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> index 1a957ea2f4fe..639be0f30b45 100644
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -323,6 +323,25 @@ struct ftrace_likely_data {
>  #define __no_sanitize_or_inline __always_inline
>  #endif
>  
> +/*
> + * Optional: only supported since gcc >= 15
> + * Optional: only supported since clang >= 18
> + *
> + *   gcc: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108896
> + * clang: https://github.com/llvm/llvm-project/pull/76348
> + *
> + * __bdos on clang < 19.1.2 can erroneously return 0:
> + * https://github.com/llvm/llvm-project/pull/110497
> + *
> + * __bdos on clang < 19.1.3 can be off by 4:
> + * https://github.com/llvm/llvm-project/pull/112636
> + */
> +#ifdef CONFIG_CC_HAS_COUNTED_BY
> +# define __counted_by(member)		__attribute__((__counted_by__(member)))
> +#else
> +# define __counted_by(member)
> +#endif
> +
>  /*
>   * Apply __counted_by() when the Endianness matches to increase test coverage.
>   */
> diff --git a/init/Kconfig b/init/Kconfig
> index 530a382ee0fe..5f1fe3583f20 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -116,6 +116,14 @@ config CC_HAS_ASM_INLINE
>  config CC_HAS_NO_PROFILE_FN_ATTR
>  	def_bool $(success,echo '__attribute__((no_profile_instrument_function)) int x();' | $(CC) -x c - -c -o /dev/null -Werror)
>  
> +# clang needs to be at least 19.1.3 to avoid __bdos miscalculations
> +# https://github.com/llvm/llvm-project/pull/110497
> +# https://github.com/llvm/llvm-project/pull/112636
> +# TODO: when gcc 15 is released remove the build test and add gcc version check
> +config CC_HAS_COUNTED_BY
> +	def_bool $(success,echo 'struct flex { int count; int array[] __attribute__((__counted_by__(count))); };' | $(CC) $(CLANG_FLAGS) -x c - -c -o /dev/null -Werror)
> +	depends on !(CC_IS_CLANG && CLANG_VERSION < 190103)
> +
>  config PAHOLE_VERSION
>  	int
>  	default $(shell,$(srctree)/scripts/pahole-version.sh $(PAHOLE))
> diff --git a/lib/overflow_kunit.c b/lib/overflow_kunit.c
> index 2abc78367dd1..5222c6393f11 100644
> --- a/lib/overflow_kunit.c
> +++ b/lib/overflow_kunit.c
> @@ -1187,7 +1187,7 @@ static void DEFINE_FLEX_test(struct kunit *test)
>  {
>  	/* Using _RAW_ on a __counted_by struct will initialize "counter" to zero */
>  	DEFINE_RAW_FLEX(struct foo, two_but_zero, array, 2);
> -#if __has_attribute(__counted_by__)
> +#ifdef CONFIG_CC_HAS_COUNTED_BY
>  	int expected_raw_size = sizeof(struct foo);
>  #else
>  	int expected_raw_size = sizeof(struct foo) + 2 * sizeof(s16);
> -- 
> 2.47.0
Miguel Ojeda Oct. 25, 2024, 8:10 a.m. UTC | #53
On Fri, Oct 25, 2024 at 3:15 AM Nathan Chancellor <nathan@kernel.org> wrote:
>
> on the official submission.

Same -- please feel free to add:

Reviewed-by: Miguel Ojeda <ojeda@kernel.org>

One nit below that is fine either way:

> > +# clang needs to be at least 19.1.3 to avoid __bdos miscalculations
> > +# https://github.com/llvm/llvm-project/pull/110497
> > +# https://github.com/llvm/llvm-project/pull/112636
> > +# TODO: when gcc 15 is released remove the build test and add gcc version check

I would perhaps move these closer to the respective lines they
comment on (i.e. `depends on` and `def_bool`).

Thanks!

Cheers,
Miguel
Jan Hendrik Farr Oct. 25, 2024, 3:27 p.m. UTC | #54
On 25 10:10:38, Miguel Ojeda wrote:
> On Fri, Oct 25, 2024 at 3:15 AM Nathan Chancellor <nathan@kernel.org> wrote:
> >
> > on the official submission.
> 
> Same -- please feel free to add:
> 
> Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
> 
> One nit below that is fine either way:
> 
> > > +# clang needs to be at least 19.1.3 to avoid __bdos miscalculations
> > > +# https://github.com/llvm/llvm-project/pull/110497
> > > +# https://github.com/llvm/llvm-project/pull/112636
> > > +# TODO: when gcc 15 is released remove the build test and add gcc version check
> 
> I would perhaps move these closer to the respective lines they
> comment on (i.e. `depends on` and `def_bool`).
> 

Done, thanks!

config CC_HAS_COUNTED_BY
	# TODO: when gcc 15 is released remove the build test and add
	# a gcc version check
	def_bool $(success,echo 'struct flex { int count; int array[] __attribute__((__counted_by__(count))); };' | $(CC) $(CLANG_FLAGS) -x c - -c -o /dev/null -Werror)
	# clang needs to be at least 19.1.3 to avoid __bdos miscalculations
	# https://github.com/llvm/llvm-project/pull/110497
	# https://github.com/llvm/llvm-project/pull/112636
	depends on !(CC_IS_CLANG && CLANG_VERSION < 190103)

Patch

diff --git a/fs/bcachefs/xattr.c b/fs/bcachefs/xattr.c
index 56c8d3fe55a4..8d7e749b7dda 100644
--- a/fs/bcachefs/xattr.c
+++ b/fs/bcachefs/xattr.c
@@ -74,6 +74,7 @@  int bch2_xattr_validate(struct bch_fs *c, struct bkey_s_c k,
 		       enum bch_validate_flags flags)
 {
 	struct bkey_s_c_xattr xattr = bkey_s_c_to_xattr(k);
+	const struct bch_xattr *v = (void *)k.v;
 	unsigned val_u64s = xattr_val_u64s(xattr.v->x_name_len,
 					   le16_to_cpu(xattr.v->x_val_len));
 	int ret = 0;
@@ -94,9 +95,12 @@  int bch2_xattr_validate(struct bch_fs *c, struct bkey_s_c k,
 
 	bkey_fsck_err_on(!bch2_xattr_type_to_handler(xattr.v->x_type),
 			 c, xattr_invalid_type,
-			 "invalid type (%u)", xattr.v->x_type);
+			 "invalid type (%u)", v->x_type);
 
-	bkey_fsck_err_on(memchr(xattr.v->x_name, '\0', xattr.v->x_name_len),
+	pr_info("x_name_len: %d", v->x_name_len);
+	pr_info("__struct_size(x_name): %ld", __struct_size(v->x_name));
+	pr_info("__struct_size(x_name): %ld", __struct_size(xattr.v->x_name));
+	bkey_fsck_err_on(memchr(v->x_name, '\0', v->x_name_len),
 			 c, xattr_name_invalid_chars,
 			 "xattr name has invalid characters");
 fsck_err: