Message ID | 20240324050554.1609460-1-syoshida@redhat.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] ipv4: Fix uninit-value access in __ip_make_skb() | expand |
On Sun, Mar 24, 2024 at 6:06 AM Shigeru Yoshida <syoshida@redhat.com> wrote: > > KMSAN reported uninit-value access in __ip_make_skb() [1]. __ip_make_skb() > tests HDRINCL to know if the skb has icmphdr. However, HDRINCL can cause a > race condition. If calling setsockopt(2) with IP_HDRINCL changes HDRINCL > while __ip_make_skb() is running, the function will access icmphdr in the > skb even if it is not included. This causes the issue reported by KMSAN. > > Check FLOWI_FLAG_KNOWN_NH on fl4->flowi4_flags instead of testing HDRINCL > on the socket. > > [1] What is the kernel version for this trace ? > BUG: KMSAN: uninit-value in __ip_make_skb+0x2b74/0x2d20 net/ipv4/ip_output.c:1481 > __ip_make_skb+0x2b74/0x2d20 net/ipv4/ip_output.c:1481 > ip_finish_skb include/net/ip.h:243 [inline] > ip_push_pending_frames+0x4c/0x5c0 net/ipv4/ip_output.c:1508 > raw_sendmsg+0x2381/0x2690 net/ipv4/raw.c:654 > inet_sendmsg+0x27b/0x2a0 net/ipv4/af_inet.c:851 > sock_sendmsg_nosec net/socket.c:730 [inline] > __sock_sendmsg+0x274/0x3c0 net/socket.c:745 > __sys_sendto+0x62c/0x7b0 net/socket.c:2191 > __do_sys_sendto net/socket.c:2203 [inline] > __se_sys_sendto net/socket.c:2199 [inline] > __x64_sys_sendto+0x130/0x200 net/socket.c:2199 > do_syscall_64+0xd8/0x1f0 arch/x86/entry/common.c:83 > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > Uninit was created at: > slab_post_alloc_hook mm/slub.c:3804 [inline] > slab_alloc_node mm/slub.c:3845 [inline] > kmem_cache_alloc_node+0x5f6/0xc50 mm/slub.c:3888 > kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:577 > __alloc_skb+0x35a/0x7c0 net/core/skbuff.c:668 > alloc_skb include/linux/skbuff.h:1318 [inline] > __ip_append_data+0x49ab/0x68c0 net/ipv4/ip_output.c:1128 > ip_append_data+0x1e7/0x260 net/ipv4/ip_output.c:1365 > raw_sendmsg+0x22b1/0x2690 net/ipv4/raw.c:648 > inet_sendmsg+0x27b/0x2a0 net/ipv4/af_inet.c:851 > sock_sendmsg_nosec net/socket.c:730 [inline] > __sock_sendmsg+0x274/0x3c0 net/socket.c:745 > __sys_sendto+0x62c/0x7b0 net/socket.c:2191 > __do_sys_sendto net/socket.c:2203 [inline] > __se_sys_sendto net/socket.c:2199 [inline] > __x64_sys_sendto+0x130/0x200 net/socket.c:2199 > do_syscall_64+0xd8/0x1f0 arch/x86/entry/common.c:83 > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > Fixes: 99e5acae193e ("ipv4: Fix potential uninit variable access bug in __ip_make_skb()") > Reported-by: syzkaller <syzkaller@googlegroups.com> > Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> > --- > I think IPv6 has a similar issue. If this patch is accepted, I will send > a patch for IPv6. > --- > net/ipv4/ip_output.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c > index 1fe794967211..39229fd0601a 100644 > --- a/net/ipv4/ip_output.c > +++ b/net/ipv4/ip_output.c > @@ -1473,7 +1473,7 @@ struct sk_buff *__ip_make_skb(struct sock *sk, > * by icmp_hdr(skb)->type. > */ > if (sk->sk_type == SOCK_RAW && > - !inet_test_bit(HDRINCL, sk)) > + !(fl4->flowi4_flags & FLOWI_FLAG_KNOWN_NH)) > icmp_type = fl4->fl4_icmp_type; > else > icmp_type = icmp_hdr(skb)->type; > -- > 2.44.0 > Thanks for your patch. I do not think this is enough, as far as syzkaller is concerned. raw_probe_proto_opt() can leave garbage in fl4_icmp_type (and fl4_icmp_code)
On Mon, 25 Mar 2024 10:01:25 +0100, Eric Dumazet wrote: > On Sun, Mar 24, 2024 at 6:06 AM Shigeru Yoshida <syoshida@redhat.com> wrote: >> >> KMSAN reported uninit-value access in __ip_make_skb() [1]. __ip_make_skb() >> tests HDRINCL to know if the skb has icmphdr. However, HDRINCL can cause a >> race condition. If calling setsockopt(2) with IP_HDRINCL changes HDRINCL >> while __ip_make_skb() is running, the function will access icmphdr in the >> skb even if it is not included. This causes the issue reported by KMSAN. >> >> Check FLOWI_FLAG_KNOWN_NH on fl4->flowi4_flags instead of testing HDRINCL >> on the socket. >> >> [1] > > What is the kernel version for this trace ? Sorry, I used the following version: CPU: 1 PID: 15709 Comm: syz-executor.7 Not tainted 6.8.0-11567-gb3603fcb79b1 #25 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014 >> BUG: KMSAN: uninit-value in __ip_make_skb+0x2b74/0x2d20 net/ipv4/ip_output.c:1481 >> __ip_make_skb+0x2b74/0x2d20 net/ipv4/ip_output.c:1481 >> ip_finish_skb include/net/ip.h:243 [inline] >> ip_push_pending_frames+0x4c/0x5c0 net/ipv4/ip_output.c:1508 >> raw_sendmsg+0x2381/0x2690 net/ipv4/raw.c:654 >> inet_sendmsg+0x27b/0x2a0 net/ipv4/af_inet.c:851 >> sock_sendmsg_nosec net/socket.c:730 [inline] >> __sock_sendmsg+0x274/0x3c0 net/socket.c:745 >> __sys_sendto+0x62c/0x7b0 net/socket.c:2191 >> __do_sys_sendto net/socket.c:2203 [inline] >> __se_sys_sendto net/socket.c:2199 [inline] >> __x64_sys_sendto+0x130/0x200 net/socket.c:2199 >> do_syscall_64+0xd8/0x1f0 arch/x86/entry/common.c:83 >> entry_SYSCALL_64_after_hwframe+0x6d/0x75 >> >> Uninit was created at: >> slab_post_alloc_hook mm/slub.c:3804 [inline] >> slab_alloc_node mm/slub.c:3845 [inline] >> kmem_cache_alloc_node+0x5f6/0xc50 mm/slub.c:3888 >> kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:577 >> __alloc_skb+0x35a/0x7c0 net/core/skbuff.c:668 >> alloc_skb include/linux/skbuff.h:1318 [inline] >> __ip_append_data+0x49ab/0x68c0 net/ipv4/ip_output.c:1128 >> ip_append_data+0x1e7/0x260 net/ipv4/ip_output.c:1365 >> raw_sendmsg+0x22b1/0x2690 net/ipv4/raw.c:648 >> inet_sendmsg+0x27b/0x2a0 net/ipv4/af_inet.c:851 >> sock_sendmsg_nosec net/socket.c:730 [inline] >> __sock_sendmsg+0x274/0x3c0 net/socket.c:745 >> __sys_sendto+0x62c/0x7b0 net/socket.c:2191 >> __do_sys_sendto net/socket.c:2203 [inline] >> __se_sys_sendto net/socket.c:2199 [inline] >> __x64_sys_sendto+0x130/0x200 net/socket.c:2199 >> do_syscall_64+0xd8/0x1f0 arch/x86/entry/common.c:83 >> entry_SYSCALL_64_after_hwframe+0x6d/0x75 >> >> Fixes: 99e5acae193e ("ipv4: Fix potential uninit variable access bug in __ip_make_skb()") >> Reported-by: syzkaller <syzkaller@googlegroups.com> >> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> >> --- >> I think IPv6 has a similar issue. If this patch is accepted, I will send >> a patch for IPv6. >> --- >> net/ipv4/ip_output.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c >> index 1fe794967211..39229fd0601a 100644 >> --- a/net/ipv4/ip_output.c >> +++ b/net/ipv4/ip_output.c >> @@ -1473,7 +1473,7 @@ struct sk_buff *__ip_make_skb(struct sock *sk, >> * by icmp_hdr(skb)->type. >> */ >> if (sk->sk_type == SOCK_RAW && >> - !inet_test_bit(HDRINCL, sk)) >> + !(fl4->flowi4_flags & FLOWI_FLAG_KNOWN_NH)) >> icmp_type = fl4->fl4_icmp_type; >> else >> icmp_type = icmp_hdr(skb)->type; >> -- >> 2.44.0 >> > > Thanks for your patch. > > I do not think this is enough, as far as syzkaller is concerned. > > raw_probe_proto_opt() can leave garbage in fl4_icmp_type (and fl4_icmp_code) Thank you for your comment. But I don't understand it clearly. What exactly do you mean by "garbage"? raw_probe_proto_opt() immediately returns 0 if fl4->flowi4_proto is not IPPROTO_ICMP: static int raw_probe_proto_opt(struct raw_frag_vec *rfv, struct flowi4 *fl4) { int err; if (fl4->flowi4_proto != IPPROTO_ICMP) return 0; In this case, the function doesn't set fl4_icmp_type. Do you mean this case? Thanks, Shigeru
On Mon, Mar 25, 2024 at 10:38 AM Shigeru Yoshida <syoshida@redhat.com> wrote: > > On Mon, 25 Mar 2024 10:01:25 +0100, Eric Dumazet wrote: > > On Sun, Mar 24, 2024 at 6:06 AM Shigeru Yoshida <syoshida@redhat.com> wrote: > >> > >> KMSAN reported uninit-value access in __ip_make_skb() [1]. __ip_make_skb() > >> tests HDRINCL to know if the skb has icmphdr. However, HDRINCL can cause a > >> race condition. If calling setsockopt(2) with IP_HDRINCL changes HDRINCL > >> while __ip_make_skb() is running, the function will access icmphdr in the > >> skb even if it is not included. This causes the issue reported by KMSAN. > >> > >> Check FLOWI_FLAG_KNOWN_NH on fl4->flowi4_flags instead of testing HDRINCL > >> on the socket. > >> > >> [1] > > > > What is the kernel version for this trace ? > > Sorry, I used the following version: > > CPU: 1 PID: 15709 Comm: syz-executor.7 Not tainted 6.8.0-11567-gb3603fcb79b1 #25 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014 > > >> BUG: KMSAN: uninit-value in __ip_make_skb+0x2b74/0x2d20 net/ipv4/ip_output.c:1481 > >> __ip_make_skb+0x2b74/0x2d20 net/ipv4/ip_output.c:1481 > >> ip_finish_skb include/net/ip.h:243 [inline] > >> ip_push_pending_frames+0x4c/0x5c0 net/ipv4/ip_output.c:1508 > >> raw_sendmsg+0x2381/0x2690 net/ipv4/raw.c:654 > >> inet_sendmsg+0x27b/0x2a0 net/ipv4/af_inet.c:851 > >> sock_sendmsg_nosec net/socket.c:730 [inline] > >> __sock_sendmsg+0x274/0x3c0 net/socket.c:745 > >> __sys_sendto+0x62c/0x7b0 net/socket.c:2191 > >> __do_sys_sendto net/socket.c:2203 [inline] > >> __se_sys_sendto net/socket.c:2199 [inline] > >> __x64_sys_sendto+0x130/0x200 net/socket.c:2199 > >> do_syscall_64+0xd8/0x1f0 arch/x86/entry/common.c:83 > >> entry_SYSCALL_64_after_hwframe+0x6d/0x75 > >> > >> Uninit was created at: > >> slab_post_alloc_hook mm/slub.c:3804 [inline] > >> slab_alloc_node mm/slub.c:3845 [inline] > >> kmem_cache_alloc_node+0x5f6/0xc50 mm/slub.c:3888 > >> kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:577 > >> __alloc_skb+0x35a/0x7c0 net/core/skbuff.c:668 > >> alloc_skb include/linux/skbuff.h:1318 [inline] > >> __ip_append_data+0x49ab/0x68c0 net/ipv4/ip_output.c:1128 > >> ip_append_data+0x1e7/0x260 net/ipv4/ip_output.c:1365 > >> raw_sendmsg+0x22b1/0x2690 net/ipv4/raw.c:648 > >> inet_sendmsg+0x27b/0x2a0 net/ipv4/af_inet.c:851 > >> sock_sendmsg_nosec net/socket.c:730 [inline] > >> __sock_sendmsg+0x274/0x3c0 net/socket.c:745 > >> __sys_sendto+0x62c/0x7b0 net/socket.c:2191 > >> __do_sys_sendto net/socket.c:2203 [inline] > >> __se_sys_sendto net/socket.c:2199 [inline] > >> __x64_sys_sendto+0x130/0x200 net/socket.c:2199 > >> do_syscall_64+0xd8/0x1f0 arch/x86/entry/common.c:83 > >> entry_SYSCALL_64_after_hwframe+0x6d/0x75 > >> > >> Fixes: 99e5acae193e ("ipv4: Fix potential uninit variable access bug in __ip_make_skb()") > >> Reported-by: syzkaller <syzkaller@googlegroups.com> > >> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> > >> --- > >> I think IPv6 has a similar issue. If this patch is accepted, I will send > >> a patch for IPv6. > >> --- > >> net/ipv4/ip_output.c | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c > >> index 1fe794967211..39229fd0601a 100644 > >> --- a/net/ipv4/ip_output.c > >> +++ b/net/ipv4/ip_output.c > >> @@ -1473,7 +1473,7 @@ struct sk_buff *__ip_make_skb(struct sock *sk, > >> * by icmp_hdr(skb)->type. > >> */ > >> if (sk->sk_type == SOCK_RAW && > >> - !inet_test_bit(HDRINCL, sk)) > >> + !(fl4->flowi4_flags & FLOWI_FLAG_KNOWN_NH)) > >> icmp_type = fl4->fl4_icmp_type; > >> else > >> icmp_type = icmp_hdr(skb)->type; > >> -- > >> 2.44.0 > >> > > > > Thanks for your patch. > > > > I do not think this is enough, as far as syzkaller is concerned. > > > > raw_probe_proto_opt() can leave garbage in fl4_icmp_type (and fl4_icmp_code) > > Thank you for your comment. But I don't understand it clearly. What > exactly do you mean by "garbage"? > > raw_probe_proto_opt() immediately returns 0 if fl4->flowi4_proto is > not IPPROTO_ICMP: > > static int raw_probe_proto_opt(struct raw_frag_vec *rfv, struct flowi4 *fl4) > { > int err; > > if (fl4->flowi4_proto != IPPROTO_ICMP) > return 0; > > In this case, the function doesn't set fl4_icmp_type. Do you mean this > case? There are multiple ways to return early from this function. In all of them, fl4->fl4_icmp_type is left uninitialized, so syzbot will find ways to trigger a related bug, if you assume later that fl4->fl4_icmp_type contains valid (initialized) data.
On Mon, 25 Mar 2024 11:05:33 +0100, Eric Dumazet wrote: > On Mon, Mar 25, 2024 at 10:38 AM Shigeru Yoshida <syoshida@redhat.com> wrote: >> >> On Mon, 25 Mar 2024 10:01:25 +0100, Eric Dumazet wrote: >> > On Sun, Mar 24, 2024 at 6:06 AM Shigeru Yoshida <syoshida@redhat.com> wrote: >> >> >> >> KMSAN reported uninit-value access in __ip_make_skb() [1]. __ip_make_skb() >> >> tests HDRINCL to know if the skb has icmphdr. However, HDRINCL can cause a >> >> race condition. If calling setsockopt(2) with IP_HDRINCL changes HDRINCL >> >> while __ip_make_skb() is running, the function will access icmphdr in the >> >> skb even if it is not included. This causes the issue reported by KMSAN. >> >> >> >> Check FLOWI_FLAG_KNOWN_NH on fl4->flowi4_flags instead of testing HDRINCL >> >> on the socket. >> >> >> >> [1] >> > >> > What is the kernel version for this trace ? >> >> Sorry, I used the following version: >> >> CPU: 1 PID: 15709 Comm: syz-executor.7 Not tainted 6.8.0-11567-gb3603fcb79b1 #25 >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014 >> >> >> BUG: KMSAN: uninit-value in __ip_make_skb+0x2b74/0x2d20 net/ipv4/ip_output.c:1481 >> >> __ip_make_skb+0x2b74/0x2d20 net/ipv4/ip_output.c:1481 >> >> ip_finish_skb include/net/ip.h:243 [inline] >> >> ip_push_pending_frames+0x4c/0x5c0 net/ipv4/ip_output.c:1508 >> >> raw_sendmsg+0x2381/0x2690 net/ipv4/raw.c:654 >> >> inet_sendmsg+0x27b/0x2a0 net/ipv4/af_inet.c:851 >> >> sock_sendmsg_nosec net/socket.c:730 [inline] >> >> __sock_sendmsg+0x274/0x3c0 net/socket.c:745 >> >> __sys_sendto+0x62c/0x7b0 net/socket.c:2191 >> >> __do_sys_sendto net/socket.c:2203 [inline] >> >> __se_sys_sendto net/socket.c:2199 [inline] >> >> __x64_sys_sendto+0x130/0x200 net/socket.c:2199 >> >> do_syscall_64+0xd8/0x1f0 arch/x86/entry/common.c:83 >> >> entry_SYSCALL_64_after_hwframe+0x6d/0x75 >> >> >> >> Uninit was created at: >> >> slab_post_alloc_hook mm/slub.c:3804 [inline] >> >> slab_alloc_node mm/slub.c:3845 [inline] >> >> kmem_cache_alloc_node+0x5f6/0xc50 mm/slub.c:3888 >> >> kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:577 >> >> __alloc_skb+0x35a/0x7c0 net/core/skbuff.c:668 >> >> alloc_skb include/linux/skbuff.h:1318 [inline] >> >> __ip_append_data+0x49ab/0x68c0 net/ipv4/ip_output.c:1128 >> >> ip_append_data+0x1e7/0x260 net/ipv4/ip_output.c:1365 >> >> raw_sendmsg+0x22b1/0x2690 net/ipv4/raw.c:648 >> >> inet_sendmsg+0x27b/0x2a0 net/ipv4/af_inet.c:851 >> >> sock_sendmsg_nosec net/socket.c:730 [inline] >> >> __sock_sendmsg+0x274/0x3c0 net/socket.c:745 >> >> __sys_sendto+0x62c/0x7b0 net/socket.c:2191 >> >> __do_sys_sendto net/socket.c:2203 [inline] >> >> __se_sys_sendto net/socket.c:2199 [inline] >> >> __x64_sys_sendto+0x130/0x200 net/socket.c:2199 >> >> do_syscall_64+0xd8/0x1f0 arch/x86/entry/common.c:83 >> >> entry_SYSCALL_64_after_hwframe+0x6d/0x75 >> >> >> >> Fixes: 99e5acae193e ("ipv4: Fix potential uninit variable access bug in __ip_make_skb()") >> >> Reported-by: syzkaller <syzkaller@googlegroups.com> >> >> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> >> >> --- >> >> I think IPv6 has a similar issue. If this patch is accepted, I will send >> >> a patch for IPv6. >> >> --- >> >> net/ipv4/ip_output.c | 2 +- >> >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> >> >> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c >> >> index 1fe794967211..39229fd0601a 100644 >> >> --- a/net/ipv4/ip_output.c >> >> +++ b/net/ipv4/ip_output.c >> >> @@ -1473,7 +1473,7 @@ struct sk_buff *__ip_make_skb(struct sock *sk, >> >> * by icmp_hdr(skb)->type. >> >> */ >> >> if (sk->sk_type == SOCK_RAW && >> >> - !inet_test_bit(HDRINCL, sk)) >> >> + !(fl4->flowi4_flags & FLOWI_FLAG_KNOWN_NH)) >> >> icmp_type = fl4->fl4_icmp_type; >> >> else >> >> icmp_type = icmp_hdr(skb)->type; >> >> -- >> >> 2.44.0 >> >> >> > >> > Thanks for your patch. >> > >> > I do not think this is enough, as far as syzkaller is concerned. >> > >> > raw_probe_proto_opt() can leave garbage in fl4_icmp_type (and fl4_icmp_code) >> >> Thank you for your comment. But I don't understand it clearly. What >> exactly do you mean by "garbage"? >> >> raw_probe_proto_opt() immediately returns 0 if fl4->flowi4_proto is >> not IPPROTO_ICMP: >> >> static int raw_probe_proto_opt(struct raw_frag_vec *rfv, struct flowi4 *fl4) >> { >> int err; >> >> if (fl4->flowi4_proto != IPPROTO_ICMP) >> return 0; >> >> In this case, the function doesn't set fl4_icmp_type. Do you mean this >> case? > > There are multiple ways to return early from this function. > > In all of them, fl4->fl4_icmp_type is left uninitialized, so syzbot > will find ways to trigger a related bug, > if you assume later that fl4->fl4_icmp_type contains valid (initialized) data. Thank you for your reply. I see your point. fl4->fl4_icmp_type is part of flowi_uli union in flowi4 structure, and flowi4_init_output() initializes fl4_dport and fl4_sport to zero. I thought this also initializes fl4_icmp_type and fl4_icmp_code. Do you think we should initialize fl4_icmp_type and fl4_icmp_code explicitly, otherwise am I misunderstanding? Thanks, Shigeru
On Mon, Mar 25, 2024 at 6:46 PM Shigeru Yoshida <syoshida@redhat.com> wrote: > > On Mon, 25 Mar 2024 11:05:33 +0100, Eric Dumazet wrote: > > On Mon, Mar 25, 2024 at 10:38 AM Shigeru Yoshida <syoshida@redhat.com> wrote: > >> > >> On Mon, 25 Mar 2024 10:01:25 +0100, Eric Dumazet wrote: > >> > On Sun, Mar 24, 2024 at 6:06 AM Shigeru Yoshida <syoshida@redhat.com> wrote: > >> >> > >> >> KMSAN reported uninit-value access in __ip_make_skb() [1]. __ip_make_skb() > >> >> tests HDRINCL to know if the skb has icmphdr. However, HDRINCL can cause a > >> >> race condition. If calling setsockopt(2) with IP_HDRINCL changes HDRINCL > >> >> while __ip_make_skb() is running, the function will access icmphdr in the > >> >> skb even if it is not included. This causes the issue reported by KMSAN. > >> >> > >> >> Check FLOWI_FLAG_KNOWN_NH on fl4->flowi4_flags instead of testing HDRINCL > >> >> on the socket. > >> >> > >> >> [1] > >> > > >> > What is the kernel version for this trace ? > >> > >> Sorry, I used the following version: > >> > >> CPU: 1 PID: 15709 Comm: syz-executor.7 Not tainted 6.8.0-11567-gb3603fcb79b1 #25 > >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014 > >> > >> >> BUG: KMSAN: uninit-value in __ip_make_skb+0x2b74/0x2d20 net/ipv4/ip_output.c:1481 > >> >> __ip_make_skb+0x2b74/0x2d20 net/ipv4/ip_output.c:1481 > >> >> ip_finish_skb include/net/ip.h:243 [inline] > >> >> ip_push_pending_frames+0x4c/0x5c0 net/ipv4/ip_output.c:1508 > >> >> raw_sendmsg+0x2381/0x2690 net/ipv4/raw.c:654 > >> >> inet_sendmsg+0x27b/0x2a0 net/ipv4/af_inet.c:851 > >> >> sock_sendmsg_nosec net/socket.c:730 [inline] > >> >> __sock_sendmsg+0x274/0x3c0 net/socket.c:745 > >> >> __sys_sendto+0x62c/0x7b0 net/socket.c:2191 > >> >> __do_sys_sendto net/socket.c:2203 [inline] > >> >> __se_sys_sendto net/socket.c:2199 [inline] > >> >> __x64_sys_sendto+0x130/0x200 net/socket.c:2199 > >> >> do_syscall_64+0xd8/0x1f0 arch/x86/entry/common.c:83 > >> >> entry_SYSCALL_64_after_hwframe+0x6d/0x75 > >> >> > >> >> Uninit was created at: > >> >> slab_post_alloc_hook mm/slub.c:3804 [inline] > >> >> slab_alloc_node mm/slub.c:3845 [inline] > >> >> kmem_cache_alloc_node+0x5f6/0xc50 mm/slub.c:3888 > >> >> kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:577 > >> >> __alloc_skb+0x35a/0x7c0 net/core/skbuff.c:668 > >> >> alloc_skb include/linux/skbuff.h:1318 [inline] > >> >> __ip_append_data+0x49ab/0x68c0 net/ipv4/ip_output.c:1128 > >> >> ip_append_data+0x1e7/0x260 net/ipv4/ip_output.c:1365 > >> >> raw_sendmsg+0x22b1/0x2690 net/ipv4/raw.c:648 > >> >> inet_sendmsg+0x27b/0x2a0 net/ipv4/af_inet.c:851 > >> >> sock_sendmsg_nosec net/socket.c:730 [inline] > >> >> __sock_sendmsg+0x274/0x3c0 net/socket.c:745 > >> >> __sys_sendto+0x62c/0x7b0 net/socket.c:2191 > >> >> __do_sys_sendto net/socket.c:2203 [inline] > >> >> __se_sys_sendto net/socket.c:2199 [inline] > >> >> __x64_sys_sendto+0x130/0x200 net/socket.c:2199 > >> >> do_syscall_64+0xd8/0x1f0 arch/x86/entry/common.c:83 > >> >> entry_SYSCALL_64_after_hwframe+0x6d/0x75 > >> >> > >> >> Fixes: 99e5acae193e ("ipv4: Fix potential uninit variable access bug in __ip_make_skb()") > >> >> Reported-by: syzkaller <syzkaller@googlegroups.com> > >> >> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> > >> >> --- > >> >> I think IPv6 has a similar issue. If this patch is accepted, I will send > >> >> a patch for IPv6. > >> >> --- > >> >> net/ipv4/ip_output.c | 2 +- > >> >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> >> > >> >> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c > >> >> index 1fe794967211..39229fd0601a 100644 > >> >> --- a/net/ipv4/ip_output.c > >> >> +++ b/net/ipv4/ip_output.c > >> >> @@ -1473,7 +1473,7 @@ struct sk_buff *__ip_make_skb(struct sock *sk, > >> >> * by icmp_hdr(skb)->type. > >> >> */ > >> >> if (sk->sk_type == SOCK_RAW && > >> >> - !inet_test_bit(HDRINCL, sk)) > >> >> + !(fl4->flowi4_flags & FLOWI_FLAG_KNOWN_NH)) > >> >> icmp_type = fl4->fl4_icmp_type; > >> >> else > >> >> icmp_type = icmp_hdr(skb)->type; > >> >> -- > >> >> 2.44.0 > >> >> > >> > > >> > Thanks for your patch. > >> > > >> > I do not think this is enough, as far as syzkaller is concerned. > >> > > >> > raw_probe_proto_opt() can leave garbage in fl4_icmp_type (and fl4_icmp_code) > >> > >> Thank you for your comment. But I don't understand it clearly. What > >> exactly do you mean by "garbage"? > >> > >> raw_probe_proto_opt() immediately returns 0 if fl4->flowi4_proto is > >> not IPPROTO_ICMP: > >> > >> static int raw_probe_proto_opt(struct raw_frag_vec *rfv, struct flowi4 *fl4) > >> { > >> int err; > >> > >> if (fl4->flowi4_proto != IPPROTO_ICMP) > >> return 0; > >> > >> In this case, the function doesn't set fl4_icmp_type. Do you mean this > >> case? > > > > There are multiple ways to return early from this function. > > > > In all of them, fl4->fl4_icmp_type is left uninitialized, so syzbot > > will find ways to trigger a related bug, > > if you assume later that fl4->fl4_icmp_type contains valid (initialized) data. > > Thank you for your reply. I see your point. > > fl4->fl4_icmp_type is part of flowi_uli union in flowi4 structure, and > flowi4_init_output() initializes fl4_dport and fl4_sport to zero. > > I thought this also initializes fl4_icmp_type and fl4_icmp_code. Do > you think we should initialize fl4_icmp_type and fl4_icmp_code > explicitly, otherwise am I misunderstanding? Yes, I am precisely saying this : do not rely on some union layout, without mentioning it in the changelog or even better in a comment. If you want to avoid clearing these fields, please add a BUILD_BUG_ON() to make sure a unrelated future change in include/net/flow.h does not break a hidden assumption. (ie : clearing fl4_dport and fl4_sport also clears fl4_icmp_type and fl4_icmp_code.)
On Mon, 25 Mar 2024 20:56:43 +0100, Eric Dumazet wrote: > On Mon, Mar 25, 2024 at 6:46 PM Shigeru Yoshida <syoshida@redhat.com> wrote: >> >> On Mon, 25 Mar 2024 11:05:33 +0100, Eric Dumazet wrote: >> > On Mon, Mar 25, 2024 at 10:38 AM Shigeru Yoshida <syoshida@redhat.com> wrote: >> >> >> >> On Mon, 25 Mar 2024 10:01:25 +0100, Eric Dumazet wrote: >> >> > On Sun, Mar 24, 2024 at 6:06 AM Shigeru Yoshida <syoshida@redhat.com> wrote: >> >> >> >> >> >> KMSAN reported uninit-value access in __ip_make_skb() [1]. __ip_make_skb() >> >> >> tests HDRINCL to know if the skb has icmphdr. However, HDRINCL can cause a >> >> >> race condition. If calling setsockopt(2) with IP_HDRINCL changes HDRINCL >> >> >> while __ip_make_skb() is running, the function will access icmphdr in the >> >> >> skb even if it is not included. This causes the issue reported by KMSAN. >> >> >> >> >> >> Check FLOWI_FLAG_KNOWN_NH on fl4->flowi4_flags instead of testing HDRINCL >> >> >> on the socket. >> >> >> >> >> >> [1] >> >> > >> >> > What is the kernel version for this trace ? >> >> >> >> Sorry, I used the following version: >> >> >> >> CPU: 1 PID: 15709 Comm: syz-executor.7 Not tainted 6.8.0-11567-gb3603fcb79b1 #25 >> >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014 >> >> >> >> >> BUG: KMSAN: uninit-value in __ip_make_skb+0x2b74/0x2d20 net/ipv4/ip_output.c:1481 >> >> >> __ip_make_skb+0x2b74/0x2d20 net/ipv4/ip_output.c:1481 >> >> >> ip_finish_skb include/net/ip.h:243 [inline] >> >> >> ip_push_pending_frames+0x4c/0x5c0 net/ipv4/ip_output.c:1508 >> >> >> raw_sendmsg+0x2381/0x2690 net/ipv4/raw.c:654 >> >> >> inet_sendmsg+0x27b/0x2a0 net/ipv4/af_inet.c:851 >> >> >> sock_sendmsg_nosec net/socket.c:730 [inline] >> >> >> __sock_sendmsg+0x274/0x3c0 net/socket.c:745 >> >> >> __sys_sendto+0x62c/0x7b0 net/socket.c:2191 >> >> >> __do_sys_sendto net/socket.c:2203 [inline] >> >> >> __se_sys_sendto net/socket.c:2199 [inline] >> >> >> __x64_sys_sendto+0x130/0x200 net/socket.c:2199 >> >> >> do_syscall_64+0xd8/0x1f0 arch/x86/entry/common.c:83 >> >> >> entry_SYSCALL_64_after_hwframe+0x6d/0x75 >> >> >> >> >> >> Uninit was created at: >> >> >> slab_post_alloc_hook mm/slub.c:3804 [inline] >> >> >> slab_alloc_node mm/slub.c:3845 [inline] >> >> >> kmem_cache_alloc_node+0x5f6/0xc50 mm/slub.c:3888 >> >> >> kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:577 >> >> >> __alloc_skb+0x35a/0x7c0 net/core/skbuff.c:668 >> >> >> alloc_skb include/linux/skbuff.h:1318 [inline] >> >> >> __ip_append_data+0x49ab/0x68c0 net/ipv4/ip_output.c:1128 >> >> >> ip_append_data+0x1e7/0x260 net/ipv4/ip_output.c:1365 >> >> >> raw_sendmsg+0x22b1/0x2690 net/ipv4/raw.c:648 >> >> >> inet_sendmsg+0x27b/0x2a0 net/ipv4/af_inet.c:851 >> >> >> sock_sendmsg_nosec net/socket.c:730 [inline] >> >> >> __sock_sendmsg+0x274/0x3c0 net/socket.c:745 >> >> >> __sys_sendto+0x62c/0x7b0 net/socket.c:2191 >> >> >> __do_sys_sendto net/socket.c:2203 [inline] >> >> >> __se_sys_sendto net/socket.c:2199 [inline] >> >> >> __x64_sys_sendto+0x130/0x200 net/socket.c:2199 >> >> >> do_syscall_64+0xd8/0x1f0 arch/x86/entry/common.c:83 >> >> >> entry_SYSCALL_64_after_hwframe+0x6d/0x75 >> >> >> >> >> >> Fixes: 99e5acae193e ("ipv4: Fix potential uninit variable access bug in __ip_make_skb()") >> >> >> Reported-by: syzkaller <syzkaller@googlegroups.com> >> >> >> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> >> >> >> --- >> >> >> I think IPv6 has a similar issue. If this patch is accepted, I will send >> >> >> a patch for IPv6. >> >> >> --- >> >> >> net/ipv4/ip_output.c | 2 +- >> >> >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> >> >> >> >> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c >> >> >> index 1fe794967211..39229fd0601a 100644 >> >> >> --- a/net/ipv4/ip_output.c >> >> >> +++ b/net/ipv4/ip_output.c >> >> >> @@ -1473,7 +1473,7 @@ struct sk_buff *__ip_make_skb(struct sock *sk, >> >> >> * by icmp_hdr(skb)->type. >> >> >> */ >> >> >> if (sk->sk_type == SOCK_RAW && >> >> >> - !inet_test_bit(HDRINCL, sk)) >> >> >> + !(fl4->flowi4_flags & FLOWI_FLAG_KNOWN_NH)) >> >> >> icmp_type = fl4->fl4_icmp_type; >> >> >> else >> >> >> icmp_type = icmp_hdr(skb)->type; >> >> >> -- >> >> >> 2.44.0 >> >> >> >> >> > >> >> > Thanks for your patch. >> >> > >> >> > I do not think this is enough, as far as syzkaller is concerned. >> >> > >> >> > raw_probe_proto_opt() can leave garbage in fl4_icmp_type (and fl4_icmp_code) >> >> >> >> Thank you for your comment. But I don't understand it clearly. What >> >> exactly do you mean by "garbage"? >> >> >> >> raw_probe_proto_opt() immediately returns 0 if fl4->flowi4_proto is >> >> not IPPROTO_ICMP: >> >> >> >> static int raw_probe_proto_opt(struct raw_frag_vec *rfv, struct flowi4 *fl4) >> >> { >> >> int err; >> >> >> >> if (fl4->flowi4_proto != IPPROTO_ICMP) >> >> return 0; >> >> >> >> In this case, the function doesn't set fl4_icmp_type. Do you mean this >> >> case? >> > >> > There are multiple ways to return early from this function. >> > >> > In all of them, fl4->fl4_icmp_type is left uninitialized, so syzbot >> > will find ways to trigger a related bug, >> > if you assume later that fl4->fl4_icmp_type contains valid (initialized) data. >> >> Thank you for your reply. I see your point. >> >> fl4->fl4_icmp_type is part of flowi_uli union in flowi4 structure, and >> flowi4_init_output() initializes fl4_dport and fl4_sport to zero. >> >> I thought this also initializes fl4_icmp_type and fl4_icmp_code. Do >> you think we should initialize fl4_icmp_type and fl4_icmp_code >> explicitly, otherwise am I misunderstanding? > > Yes, I am precisely saying this : do not rely on some union layout, > without mentioning it in the changelog > or even better in a comment. > > If you want to avoid clearing these fields, please add a > BUILD_BUG_ON() to make sure > a unrelated future change in include/net/flow.h does not break a > hidden assumption. > > (ie : clearing fl4_dport and fl4_sport also clears fl4_icmp_type and > fl4_icmp_code.) Sorry for the late response. Thank you Eric! I got it. I will try to make v2 patch this weekend. Thanks, Shigeru
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 1fe794967211..39229fd0601a 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -1473,7 +1473,7 @@ struct sk_buff *__ip_make_skb(struct sock *sk, * by icmp_hdr(skb)->type. */ if (sk->sk_type == SOCK_RAW && - !inet_test_bit(HDRINCL, sk)) + !(fl4->flowi4_flags & FLOWI_FLAG_KNOWN_NH)) icmp_type = fl4->fl4_icmp_type; else icmp_type = icmp_hdr(skb)->type;
KMSAN reported uninit-value access in __ip_make_skb() [1]. __ip_make_skb() tests HDRINCL to know if the skb has icmphdr. However, HDRINCL can cause a race condition. If calling setsockopt(2) with IP_HDRINCL changes HDRINCL while __ip_make_skb() is running, the function will access icmphdr in the skb even if it is not included. This causes the issue reported by KMSAN. Check FLOWI_FLAG_KNOWN_NH on fl4->flowi4_flags instead of testing HDRINCL on the socket. [1] BUG: KMSAN: uninit-value in __ip_make_skb+0x2b74/0x2d20 net/ipv4/ip_output.c:1481 __ip_make_skb+0x2b74/0x2d20 net/ipv4/ip_output.c:1481 ip_finish_skb include/net/ip.h:243 [inline] ip_push_pending_frames+0x4c/0x5c0 net/ipv4/ip_output.c:1508 raw_sendmsg+0x2381/0x2690 net/ipv4/raw.c:654 inet_sendmsg+0x27b/0x2a0 net/ipv4/af_inet.c:851 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x274/0x3c0 net/socket.c:745 __sys_sendto+0x62c/0x7b0 net/socket.c:2191 __do_sys_sendto net/socket.c:2203 [inline] __se_sys_sendto net/socket.c:2199 [inline] __x64_sys_sendto+0x130/0x200 net/socket.c:2199 do_syscall_64+0xd8/0x1f0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x6d/0x75 Uninit was created at: slab_post_alloc_hook mm/slub.c:3804 [inline] slab_alloc_node mm/slub.c:3845 [inline] kmem_cache_alloc_node+0x5f6/0xc50 mm/slub.c:3888 kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:577 __alloc_skb+0x35a/0x7c0 net/core/skbuff.c:668 alloc_skb include/linux/skbuff.h:1318 [inline] __ip_append_data+0x49ab/0x68c0 net/ipv4/ip_output.c:1128 ip_append_data+0x1e7/0x260 net/ipv4/ip_output.c:1365 raw_sendmsg+0x22b1/0x2690 net/ipv4/raw.c:648 inet_sendmsg+0x27b/0x2a0 net/ipv4/af_inet.c:851 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x274/0x3c0 net/socket.c:745 __sys_sendto+0x62c/0x7b0 net/socket.c:2191 __do_sys_sendto net/socket.c:2203 [inline] __se_sys_sendto net/socket.c:2199 [inline] __x64_sys_sendto+0x130/0x200 net/socket.c:2199 do_syscall_64+0xd8/0x1f0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x6d/0x75 Fixes: 99e5acae193e ("ipv4: Fix potential uninit variable access bug in __ip_make_skb()") Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> --- I think IPv6 has a similar issue. If this patch is accepted, I will send a patch for IPv6. --- net/ipv4/ip_output.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)