diff mbox series

[v1,net] kcm: Serialise kcm_sendmsg() for the same socket.

Message ID 20240815220437.69511-1-kuniyu@amazon.com (mailing list archive)
State Accepted
Commit 807067bf014d4a3ae2cc55bd3de16f22a01eb580
Delegated to: Netdev Maintainers
Headers show
Series [v1,net] kcm: Serialise kcm_sendmsg() for the same socket. | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 29 this patch: 29
netdev/build_tools success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers success CCed 6 of 6 maintainers
netdev/build_clang success Errors and warnings before: 29 this patch: 29
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 29 this patch: 29
netdev/checkpatch warning CHECK: struct mutex definition without comment WARNING: Possible repeated word: 'Google'
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-08-16--21-00 (tests: 710)

Commit Message

Kuniyuki Iwashima Aug. 15, 2024, 10:04 p.m. UTC
syzkaller reported UAF in kcm_release(). [0]

The scenario is

  1. Thread A builds a skb with MSG_MORE and sets kcm->seq_skb.

  2. Thread A resumes building skb from kcm->seq_skb but is blocked
     by sk_stream_wait_memory()

  3. Thread B calls sendmsg() concurrently, finishes building kcm->seq_skb
     and puts the skb to the write queue

  4. Thread A faces an error and finally frees skb that is already in the
     write queue

  5. kcm_release() does double-free the skb in the write queue

When a thread is building a MSG_MORE skb, another thread must not touch it.

Let's add a per-sk mutex and serialise kcm_sendmsg().

[0]:
BUG: KASAN: slab-use-after-free in __skb_unlink include/linux/skbuff.h:2366 [inline]
BUG: KASAN: slab-use-after-free in __skb_dequeue include/linux/skbuff.h:2385 [inline]
BUG: KASAN: slab-use-after-free in __skb_queue_purge_reason include/linux/skbuff.h:3175 [inline]
BUG: KASAN: slab-use-after-free in __skb_queue_purge include/linux/skbuff.h:3181 [inline]
BUG: KASAN: slab-use-after-free in kcm_release+0x170/0x4c8 net/kcm/kcmsock.c:1691
Read of size 8 at addr ffff0000ced0fc80 by task syz-executor329/6167

CPU: 1 PID: 6167 Comm: syz-executor329 Tainted: G    B              6.8.0-rc5-syzkaller-g9abbc24128bc #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
Call trace:
 dump_backtrace+0x1b8/0x1e4 arch/arm64/kernel/stacktrace.c:291
 show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:298
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xd0/0x124 lib/dump_stack.c:106
 print_address_description mm/kasan/report.c:377 [inline]
 print_report+0x178/0x518 mm/kasan/report.c:488
 kasan_report+0xd8/0x138 mm/kasan/report.c:601
 __asan_report_load8_noabort+0x20/0x2c mm/kasan/report_generic.c:381
 __skb_unlink include/linux/skbuff.h:2366 [inline]
 __skb_dequeue include/linux/skbuff.h:2385 [inline]
 __skb_queue_purge_reason include/linux/skbuff.h:3175 [inline]
 __skb_queue_purge include/linux/skbuff.h:3181 [inline]
 kcm_release+0x170/0x4c8 net/kcm/kcmsock.c:1691
 __sock_release net/socket.c:659 [inline]
 sock_close+0xa4/0x1e8 net/socket.c:1421
 __fput+0x30c/0x738 fs/file_table.c:376
 ____fput+0x20/0x30 fs/file_table.c:404
 task_work_run+0x230/0x2e0 kernel/task_work.c:180
 exit_task_work include/linux/task_work.h:38 [inline]
 do_exit+0x618/0x1f64 kernel/exit.c:871
 do_group_exit+0x194/0x22c kernel/exit.c:1020
 get_signal+0x1500/0x15ec kernel/signal.c:2893
 do_signal+0x23c/0x3b44 arch/arm64/kernel/signal.c:1249
 do_notify_resume+0x74/0x1f4 arch/arm64/kernel/entry-common.c:148
 exit_to_user_mode_prepare arch/arm64/kernel/entry-common.c:169 [inline]
 exit_to_user_mode arch/arm64/kernel/entry-common.c:178 [inline]
 el0_svc+0xac/0x168 arch/arm64/kernel/entry-common.c:713
 el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598

Allocated by task 6166:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x40/0x78 mm/kasan/common.c:68
 kasan_save_alloc_info+0x70/0x84 mm/kasan/generic.c:626
 unpoison_slab_object mm/kasan/common.c:314 [inline]
 __kasan_slab_alloc+0x74/0x8c mm/kasan/common.c:340
 kasan_slab_alloc include/linux/kasan.h:201 [inline]
 slab_post_alloc_hook mm/slub.c:3813 [inline]
 slab_alloc_node mm/slub.c:3860 [inline]
 kmem_cache_alloc_node+0x204/0x4c0 mm/slub.c:3903
 __alloc_skb+0x19c/0x3d8 net/core/skbuff.c:641
 alloc_skb include/linux/skbuff.h:1296 [inline]
 kcm_sendmsg+0x1d3c/0x2124 net/kcm/kcmsock.c:783
 sock_sendmsg_nosec net/socket.c:730 [inline]
 __sock_sendmsg net/socket.c:745 [inline]
 sock_sendmsg+0x220/0x2c0 net/socket.c:768
 splice_to_socket+0x7cc/0xd58 fs/splice.c:889
 do_splice_from fs/splice.c:941 [inline]
 direct_splice_actor+0xec/0x1d8 fs/splice.c:1164
 splice_direct_to_actor+0x438/0xa0c fs/splice.c:1108
 do_splice_direct_actor fs/splice.c:1207 [inline]
 do_splice_direct+0x1e4/0x304 fs/splice.c:1233
 do_sendfile+0x460/0xb3c fs/read_write.c:1295
 __do_sys_sendfile64 fs/read_write.c:1362 [inline]
 __se_sys_sendfile64 fs/read_write.c:1348 [inline]
 __arm64_sys_sendfile64+0x160/0x3b4 fs/read_write.c:1348
 __invoke_syscall arch/arm64/kernel/syscall.c:37 [inline]
 invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:51
 el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:136
 do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:155
 el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
 el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598

Freed by task 6167:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x40/0x78 mm/kasan/common.c:68
 kasan_save_free_info+0x5c/0x74 mm/kasan/generic.c:640
 poison_slab_object+0x124/0x18c mm/kasan/common.c:241
 __kasan_slab_free+0x3c/0x78 mm/kasan/common.c:257
 kasan_slab_free include/linux/kasan.h:184 [inline]
 slab_free_hook mm/slub.c:2121 [inline]
 slab_free mm/slub.c:4299 [inline]
 kmem_cache_free+0x15c/0x3d4 mm/slub.c:4363
 kfree_skbmem+0x10c/0x19c
 __kfree_skb net/core/skbuff.c:1109 [inline]
 kfree_skb_reason+0x240/0x6f4 net/core/skbuff.c:1144
 kfree_skb include/linux/skbuff.h:1244 [inline]
 kcm_release+0x104/0x4c8 net/kcm/kcmsock.c:1685
 __sock_release net/socket.c:659 [inline]
 sock_close+0xa4/0x1e8 net/socket.c:1421
 __fput+0x30c/0x738 fs/file_table.c:376
 ____fput+0x20/0x30 fs/file_table.c:404
 task_work_run+0x230/0x2e0 kernel/task_work.c:180
 exit_task_work include/linux/task_work.h:38 [inline]
 do_exit+0x618/0x1f64 kernel/exit.c:871
 do_group_exit+0x194/0x22c kernel/exit.c:1020
 get_signal+0x1500/0x15ec kernel/signal.c:2893
 do_signal+0x23c/0x3b44 arch/arm64/kernel/signal.c:1249
 do_notify_resume+0x74/0x1f4 arch/arm64/kernel/entry-common.c:148
 exit_to_user_mode_prepare arch/arm64/kernel/entry-common.c:169 [inline]
 exit_to_user_mode arch/arm64/kernel/entry-common.c:178 [inline]
 el0_svc+0xac/0x168 arch/arm64/kernel/entry-common.c:713
 el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598

The buggy address belongs to the object at ffff0000ced0fc80
 which belongs to the cache skbuff_head_cache of size 240
The buggy address is located 0 bytes inside of
 freed 240-byte region [ffff0000ced0fc80, ffff0000ced0fd70)

The buggy address belongs to the physical page:
page:00000000d35f4ae4 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10ed0f
flags: 0x5ffc00000000800(slab|node=0|zone=2|lastcpupid=0x7ff)
page_type: 0xffffffff()
raw: 05ffc00000000800 ffff0000c1cbf640 fffffdffc3423100 dead000000000004
raw: 0000000000000000 00000000000c000c 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff0000ced0fb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff0000ced0fc00: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
>ffff0000ced0fc80: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff0000ced0fd00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc
 ffff0000ced0fd80: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb

Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
Reported-by: syzbot+b72d86aa5df17ce74c60@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b72d86aa5df17ce74c60
Tested-by: syzbot+b72d86aa5df17ce74c60@syzkaller.appspotmail.com
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 include/net/kcm.h | 1 +
 net/kcm/kcmsock.c | 4 ++++
 2 files changed, 5 insertions(+)

Comments

Jason Xing Aug. 16, 2024, 2:56 a.m. UTC | #1
Hello Kuniyuki,

On Fri, Aug 16, 2024 at 6:05 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> syzkaller reported UAF in kcm_release(). [0]
>
> The scenario is
>
>   1. Thread A builds a skb with MSG_MORE and sets kcm->seq_skb.
>
>   2. Thread A resumes building skb from kcm->seq_skb but is blocked
>      by sk_stream_wait_memory()
>
>   3. Thread B calls sendmsg() concurrently, finishes building kcm->seq_skb
>      and puts the skb to the write queue
>
>   4. Thread A faces an error and finally frees skb that is already in the
>      write queue
>
>   5. kcm_release() does double-free the skb in the write queue
>
> When a thread is building a MSG_MORE skb, another thread must not touch it.

Thanks for the analysis.

Since the empty skb (without payload) could cause such race and
double-free issue, I wonder if we can clear the empty skb before
waiting for memory, like what already implemented in
tcp_recvmsg_locked():

commit 72bf4f1767f0386970dc04726dc5bc2e3991dc19
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Oct 19 11:24:57 2023 +0000

    net: do not leave an empty skb in write queue

    Under memory stress conditions, tcp_sendmsg_locked()
    might call sk_stream_wait_memory(), thus releasing the socket lock.

    If a fresh skb has been allocated prior to this,
    we should not leave it in the write queue otherwise
    tcp_write_xmit() could panic.

I'm not pretty sure if I understand correctly.

Thanks,
Jason
Kuniyuki Iwashima Aug. 16, 2024, 3:05 a.m. UTC | #2
From: Jason Xing <kerneljasonxing@gmail.com>
Date: Fri, 16 Aug 2024 10:56:19 +0800
> Hello Kuniyuki,
> 
> On Fri, Aug 16, 2024 at 6:05 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > syzkaller reported UAF in kcm_release(). [0]
> >
> > The scenario is
> >
> >   1. Thread A builds a skb with MSG_MORE and sets kcm->seq_skb.
> >
> >   2. Thread A resumes building skb from kcm->seq_skb but is blocked
> >      by sk_stream_wait_memory()
> >
> >   3. Thread B calls sendmsg() concurrently, finishes building kcm->seq_skb
> >      and puts the skb to the write queue
> >
> >   4. Thread A faces an error and finally frees skb that is already in the
> >      write queue
> >
> >   5. kcm_release() does double-free the skb in the write queue
> >
> > When a thread is building a MSG_MORE skb, another thread must not touch it.
> 
> Thanks for the analysis.
> 
> Since the empty skb (without payload) could cause such race and
> double-free issue, I wonder if we can clear the empty skb before
> waiting for memory,

kcm->seq_skb is set when a part of data is copied to skb, so it's not
empty.  Also, seq_skb is cleared when queued to the write queue.

The problem is one thread referencing kcm->seq_skb goes to sleep and
another thread queues the skb to the write queue.

---8<---
        if (eor) {
                bool not_busy = skb_queue_empty(&sk->sk_write_queue);

                if (head) {
                        /* Message complete, queue it on send buffer */
                        __skb_queue_tail(&sk->sk_write_queue, head);
                        kcm->seq_skb = NULL;
                        KCM_STATS_INCR(kcm->stats.tx_msgs);
                }
...
        } else {
                /* Message not complete, save state */
partial_message:
                if (head) {
                        kcm->seq_skb = head;
                        kcm_tx_msg(head)->last_skb = skb;
                }
---8<---
Jason Xing Aug. 16, 2024, 3:36 a.m. UTC | #3
On Fri, Aug 16, 2024 at 11:05 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> From: Jason Xing <kerneljasonxing@gmail.com>
> Date: Fri, 16 Aug 2024 10:56:19 +0800
> > Hello Kuniyuki,
> >
> > On Fri, Aug 16, 2024 at 6:05 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> > >
> > > syzkaller reported UAF in kcm_release(). [0]
> > >
> > > The scenario is
> > >
> > >   1. Thread A builds a skb with MSG_MORE and sets kcm->seq_skb.
> > >
> > >   2. Thread A resumes building skb from kcm->seq_skb but is blocked
> > >      by sk_stream_wait_memory()
> > >
> > >   3. Thread B calls sendmsg() concurrently, finishes building kcm->seq_skb
> > >      and puts the skb to the write queue
> > >
> > >   4. Thread A faces an error and finally frees skb that is already in the
> > >      write queue
> > >
> > >   5. kcm_release() does double-free the skb in the write queue
> > >
> > > When a thread is building a MSG_MORE skb, another thread must not touch it.
> >
> > Thanks for the analysis.
> >
> > Since the empty skb (without payload) could cause such race and
> > double-free issue, I wonder if we can clear the empty skb before
> > waiting for memory,
>
> kcm->seq_skb is set when a part of data is copied to skb, so it's not
> empty.  Also, seq_skb is cleared when queued to the write queue.
>
> The problem is one thread referencing kcm->seq_skb goes to sleep and
> another thread queues the skb to the write queue.
>
> ---8<---
>         if (eor) {
>                 bool not_busy = skb_queue_empty(&sk->sk_write_queue);
>
>                 if (head) {
>                         /* Message complete, queue it on send buffer */
>                         __skb_queue_tail(&sk->sk_write_queue, head);
>                         kcm->seq_skb = NULL;
>                         KCM_STATS_INCR(kcm->stats.tx_msgs);
>                 }
> ...
>         } else {
>                 /* Message not complete, save state */
> partial_message:
>                 if (head) {
>                         kcm->seq_skb = head;
>                         kcm_tx_msg(head)->last_skb = skb;
>                 }
> ---8<---

Oh, I see the difference of handling error part after waiting for
memory between tcp_sendmsg_locked and kcm_sendmsg:
In kcm_sendmsg, it could kfree the skb which causes the issue while tcp doesn't.

But I cannot help asking if that lock is a little bit heavy, please
don't get me wrong, I'm not against it. In the meantime, I decided to
take a deep look at the 'out_error' label part.

Thanks,
Jason
Kuniyuki Iwashima Aug. 16, 2024, 3:46 a.m. UTC | #4
From: Jason Xing <kerneljasonxing@gmail.com>
Date: Fri, 16 Aug 2024 11:36:35 +0800
> On Fri, Aug 16, 2024 at 11:05 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > From: Jason Xing <kerneljasonxing@gmail.com>
> > Date: Fri, 16 Aug 2024 10:56:19 +0800
> > > Hello Kuniyuki,
> > >
> > > On Fri, Aug 16, 2024 at 6:05 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> > > >
> > > > syzkaller reported UAF in kcm_release(). [0]
> > > >
> > > > The scenario is
> > > >
> > > >   1. Thread A builds a skb with MSG_MORE and sets kcm->seq_skb.
> > > >
> > > >   2. Thread A resumes building skb from kcm->seq_skb but is blocked
> > > >      by sk_stream_wait_memory()
> > > >
> > > >   3. Thread B calls sendmsg() concurrently, finishes building kcm->seq_skb
> > > >      and puts the skb to the write queue
> > > >
> > > >   4. Thread A faces an error and finally frees skb that is already in the
> > > >      write queue
> > > >
> > > >   5. kcm_release() does double-free the skb in the write queue
> > > >
> > > > When a thread is building a MSG_MORE skb, another thread must not touch it.
> > >
> > > Thanks for the analysis.
> > >
> > > Since the empty skb (without payload) could cause such race and
> > > double-free issue, I wonder if we can clear the empty skb before
> > > waiting for memory,
> >
> > kcm->seq_skb is set when a part of data is copied to skb, so it's not
> > empty.  Also, seq_skb is cleared when queued to the write queue.
> >
> > The problem is one thread referencing kcm->seq_skb goes to sleep and
> > another thread queues the skb to the write queue.
> >
> > ---8<---
> >         if (eor) {
> >                 bool not_busy = skb_queue_empty(&sk->sk_write_queue);
> >
> >                 if (head) {
> >                         /* Message complete, queue it on send buffer */
> >                         __skb_queue_tail(&sk->sk_write_queue, head);
> >                         kcm->seq_skb = NULL;
> >                         KCM_STATS_INCR(kcm->stats.tx_msgs);
> >                 }
> > ...
> >         } else {
> >                 /* Message not complete, save state */
> > partial_message:
> >                 if (head) {
> >                         kcm->seq_skb = head;
> >                         kcm_tx_msg(head)->last_skb = skb;
> >                 }
> > ---8<---
> 
> Oh, I see the difference of handling error part after waiting for
> memory between tcp_sendmsg_locked and kcm_sendmsg:
> In kcm_sendmsg, it could kfree the skb which causes the issue while tcp doesn't.
> 
> But I cannot help asking if that lock is a little bit heavy, please
> don't get me wrong, I'm not against it. In the meantime, I decided to
> take a deep look at the 'out_error' label part.

I don't think the mutex is heavy because kcm_sendmsg() is already
serialised with lock_sock().
Jason Xing Aug. 16, 2024, 6:57 a.m. UTC | #5
On Fri, Aug 16, 2024 at 11:47 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> From: Jason Xing <kerneljasonxing@gmail.com>
> Date: Fri, 16 Aug 2024 11:36:35 +0800
> > On Fri, Aug 16, 2024 at 11:05 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> > >
> > > From: Jason Xing <kerneljasonxing@gmail.com>
> > > Date: Fri, 16 Aug 2024 10:56:19 +0800
> > > > Hello Kuniyuki,
> > > >
> > > > On Fri, Aug 16, 2024 at 6:05 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> > > > >
> > > > > syzkaller reported UAF in kcm_release(). [0]
> > > > >
> > > > > The scenario is
> > > > >
> > > > >   1. Thread A builds a skb with MSG_MORE and sets kcm->seq_skb.
> > > > >
> > > > >   2. Thread A resumes building skb from kcm->seq_skb but is blocked
> > > > >      by sk_stream_wait_memory()
> > > > >
> > > > >   3. Thread B calls sendmsg() concurrently, finishes building kcm->seq_skb
> > > > >      and puts the skb to the write queue
> > > > >
> > > > >   4. Thread A faces an error and finally frees skb that is already in the
> > > > >      write queue
> > > > >
> > > > >   5. kcm_release() does double-free the skb in the write queue
> > > > >
> > > > > When a thread is building a MSG_MORE skb, another thread must not touch it.
> > > >
> > > > Thanks for the analysis.
> > > >
> > > > Since the empty skb (without payload) could cause such race and
> > > > double-free issue, I wonder if we can clear the empty skb before
> > > > waiting for memory,
> > >
> > > kcm->seq_skb is set when a part of data is copied to skb, so it's not
> > > empty.  Also, seq_skb is cleared when queued to the write queue.
> > >
> > > The problem is one thread referencing kcm->seq_skb goes to sleep and
> > > another thread queues the skb to the write queue.
> > >
> > > ---8<---
> > >         if (eor) {
> > >                 bool not_busy = skb_queue_empty(&sk->sk_write_queue);
> > >
> > >                 if (head) {
> > >                         /* Message complete, queue it on send buffer */
> > >                         __skb_queue_tail(&sk->sk_write_queue, head);
> > >                         kcm->seq_skb = NULL;
> > >                         KCM_STATS_INCR(kcm->stats.tx_msgs);
> > >                 }
> > > ...
> > >         } else {
> > >                 /* Message not complete, save state */
> > > partial_message:
> > >                 if (head) {
> > >                         kcm->seq_skb = head;
> > >                         kcm_tx_msg(head)->last_skb = skb;
> > >                 }
> > > ---8<---
> >
> > Oh, I see the difference of handling error part after waiting for
> > memory between tcp_sendmsg_locked and kcm_sendmsg:
> > In kcm_sendmsg, it could kfree the skb which causes the issue while tcp doesn't.
> >
> > But I cannot help asking if that lock is a little bit heavy, please
> > don't get me wrong, I'm not against it. In the meantime, I decided to
> > take a deep look at the 'out_error' label part.
>
> I don't think the mutex is heavy because kcm_sendmsg() is already
> serialised with lock_sock().

It makes sense. I have to say that it was my concern.

After digging into this part, sorry, I can't find a easy way to
prevent double-free issues because:
initially I was trying using seq_skb (something like that) as an
indicator to reflect whether we are allowed to kfree the skb, but it
doesn't work for all the cases. Supposing there are three threads,
each of them can call kcm_sendmsg() and wait, which makes it more
complicated. Let alone more threads access this function and try to
grab the lock nearly concurrently.

Making the whole process serialised can make life easier. For sure,
introducing mutex locks can solve the issue.

Thanks,
Jason
Eric Dumazet Aug. 19, 2024, 3:56 p.m. UTC | #6
On Fri, Aug 16, 2024 at 12:04 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> syzkaller reported UAF in kcm_release(). [0]
>
> The scenario is
>
>   1. Thread A builds a skb with MSG_MORE and sets kcm->seq_skb.
>
>   2. Thread A resumes building skb from kcm->seq_skb but is blocked
>      by sk_stream_wait_memory()
>
>   3. Thread B calls sendmsg() concurrently, finishes building kcm->seq_skb
>      and puts the skb to the write queue
>
>   4. Thread A faces an error and finally frees skb that is already in the
>      write queue
>
>   5. kcm_release() does double-free the skb in the write queue
>
> When a thread is building a MSG_MORE skb, another thread must not touch it.
>
> Let's add a per-sk mutex and serialise kcm_sendmsg().
>
>
> Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
> Reported-by: syzbot+b72d86aa5df17ce74c60@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=b72d86aa5df17ce74c60
> Tested-by: syzbot+b72d86aa5df17ce74c60@syzkaller.appspotmail.com
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> ---

Reviewed-by: Eric Dumazet <edumazet@google.com>

I wonder if anyone is using KCM....
Kuniyuki Iwashima Aug. 19, 2024, 6:02 p.m. UTC | #7
From: Eric Dumazet <edumazet@google.com>
Date: Mon, 19 Aug 2024 17:56:55 +0200
> On Fri, Aug 16, 2024 at 12:04 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > syzkaller reported UAF in kcm_release(). [0]
> >
> > The scenario is
> >
> >   1. Thread A builds a skb with MSG_MORE and sets kcm->seq_skb.
> >
> >   2. Thread A resumes building skb from kcm->seq_skb but is blocked
> >      by sk_stream_wait_memory()
> >
> >   3. Thread B calls sendmsg() concurrently, finishes building kcm->seq_skb
> >      and puts the skb to the write queue
> >
> >   4. Thread A faces an error and finally frees skb that is already in the
> >      write queue
> >
> >   5. kcm_release() does double-free the skb in the write queue
> >
> > When a thread is building a MSG_MORE skb, another thread must not touch it.
> >
> > Let's add a per-sk mutex and serialise kcm_sendmsg().
> >
> >
> > Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
> > Reported-by: syzbot+b72d86aa5df17ce74c60@syzkaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=b72d86aa5df17ce74c60
> > Tested-by: syzbot+b72d86aa5df17ce74c60@syzkaller.appspotmail.com
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > ---
> 
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> 
> I wonder if anyone is using KCM....

Same impression :)
Maybe it's time to remove KCM.
https://lore.kernel.org/netdev/CALx6S37hSfQWw3Rku8vsavn8ejk0fndRk+=-=73gU7G-RbnK8Q@mail.gmail.com/
patchwork-bot+netdevbpf@kernel.org Aug. 20, 2024, 2:20 a.m. UTC | #8
Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 15 Aug 2024 15:04:37 -0700 you wrote:
> syzkaller reported UAF in kcm_release(). [0]
> 
> The scenario is
> 
>   1. Thread A builds a skb with MSG_MORE and sets kcm->seq_skb.
> 
>   2. Thread A resumes building skb from kcm->seq_skb but is blocked
>      by sk_stream_wait_memory()
> 
> [...]

Here is the summary with links:
  - [v1,net] kcm: Serialise kcm_sendmsg() for the same socket.
    https://git.kernel.org/netdev/net/c/807067bf014d

You are awesome, thank you!
diff mbox series

Patch

diff --git a/include/net/kcm.h b/include/net/kcm.h
index 90279e5e09a5..441e993be634 100644
--- a/include/net/kcm.h
+++ b/include/net/kcm.h
@@ -70,6 +70,7 @@  struct kcm_sock {
 	struct work_struct tx_work;
 	struct list_head wait_psock_list;
 	struct sk_buff *seq_skb;
+	struct mutex tx_mutex;
 	u32 tx_stopped : 1;
 
 	/* Don't use bit fields here, these are set under different locks */
diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index 2f191e50d4fc..d4118c796290 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -755,6 +755,7 @@  static int kcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 		  !(msg->msg_flags & MSG_MORE) : !!(msg->msg_flags & MSG_EOR);
 	int err = -EPIPE;
 
+	mutex_lock(&kcm->tx_mutex);
 	lock_sock(sk);
 
 	/* Per tcp_sendmsg this should be in poll */
@@ -926,6 +927,7 @@  static int kcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	KCM_STATS_ADD(kcm->stats.tx_bytes, copied);
 
 	release_sock(sk);
+	mutex_unlock(&kcm->tx_mutex);
 	return copied;
 
 out_error:
@@ -951,6 +953,7 @@  static int kcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 		sk->sk_write_space(sk);
 
 	release_sock(sk);
+	mutex_unlock(&kcm->tx_mutex);
 	return err;
 }
 
@@ -1204,6 +1207,7 @@  static void init_kcm_sock(struct kcm_sock *kcm, struct kcm_mux *mux)
 	spin_unlock_bh(&mux->lock);
 
 	INIT_WORK(&kcm->tx_work, kcm_tx_work);
+	mutex_init(&kcm->tx_mutex);
 
 	spin_lock_bh(&mux->rx_lock);
 	kcm_rcv_ready(kcm);