diff mbox series

[net,v2,1/2] vsock: Fix memory leak in vsock_connect()

Message ID: a02c6e7e3135473d254ac97abc603d963ba8f716.1659862577.git.peilin.ye@bytedance.com (mailing list archive)
State: Superseded
Delegated to: Netdev Maintainers

Checks

Context                         Check    Description
netdev/tree_selection           success  Clearly marked for net
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/subject_prefix           success  Link
netdev/cover_letter             success  Single patches do not need cover letters
netdev/patch_count              success  Link
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers           success  CCed 10 of 10 maintainers
netdev/build_clang              success  Errors and warnings before: 0 this patch: 0
netdev/module_param             success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 0 this patch: 0
netdev/checkpatch               warning  WARNING: line length of 85 exceeds 80 columns
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Peilin Ye Aug. 7, 2022, 9 a.m. UTC
From: Peilin Ye <peilin.ye@bytedance.com>

An O_NONBLOCK vsock_connect() request may try to reschedule
@connect_work.  Imagine the following sequence of vsock_connect()
requests:

  1. The 1st, non-blocking request schedules @connect_work, which will
     expire after 200 jiffies.  Socket state is now SS_CONNECTING;

  2. Later, the 2nd, blocking request gets interrupted by a signal after
     a few jiffies while waiting for the connection to be established.
     Socket state is back to SS_UNCONNECTED, but @connect_work is still
     pending, and will expire after 100 jiffies.

  3. Now, the 3rd, non-blocking request tries to schedule @connect_work
     again.  Since @connect_work is already scheduled,
     schedule_delayed_work() silently returns.  sock_hold() is called
     twice, but sock_put() will only be called once in
     vsock_connect_timeout(), causing a memory leak reported by syzbot:

  BUG: memory leak
  unreferenced object 0xffff88810ea56a40 (size 1232):
    comm "syz-executor756", pid 3604, jiffies 4294947681 (age 12.350s)
    hex dump (first 32 bytes):
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      28 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00  (..@............
    backtrace:
      [<ffffffff837c830e>] sk_prot_alloc+0x3e/0x1b0 net/core/sock.c:1930
      [<ffffffff837cbe22>] sk_alloc+0x32/0x2e0 net/core/sock.c:1989
      [<ffffffff842ccf68>] __vsock_create.constprop.0+0x38/0x320 net/vmw_vsock/af_vsock.c:734
      [<ffffffff842ce8f1>] vsock_create+0xc1/0x2d0 net/vmw_vsock/af_vsock.c:2203
      [<ffffffff837c0cbb>] __sock_create+0x1ab/0x2b0 net/socket.c:1468
      [<ffffffff837c3acf>] sock_create net/socket.c:1519 [inline]
      [<ffffffff837c3acf>] __sys_socket+0x6f/0x140 net/socket.c:1561
      [<ffffffff837c3bba>] __do_sys_socket net/socket.c:1570 [inline]
      [<ffffffff837c3bba>] __se_sys_socket net/socket.c:1568 [inline]
      [<ffffffff837c3bba>] __x64_sys_socket+0x1a/0x20 net/socket.c:1568
      [<ffffffff84512815>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      [<ffffffff84512815>] do_syscall_64+0x35/0x80 arch/x86/entry/common.c:80
      [<ffffffff84600068>] entry_SYSCALL_64_after_hwframe+0x44/0xae
  <...>

Use mod_delayed_work() instead: if @connect_work is already scheduled,
reschedule it, and undo sock_hold() to keep the reference count
balanced.

Reported-and-tested-by: syzbot+b03f55bf128f9a38f064@syzkaller.appspotmail.com
Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
Co-developed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
---
change since v1:
  - merged with Stefano's patch [1]

[1] https://gitlab.com/sgarzarella/linux/-/commit/2d0f0b9cbbb30d58fdcbca7c1a857fd8f3110d61

Hi Stefano,

About the Fixes: tag, [2] introduced @connect_work, but all it did was
break @dwork into two and move some INIT_DELAYED_WORK() calls, so I don't
think [2] introduced this memory leak?

Since [2] has already been backported to 4.9 and 4.14, I think we can
use Fixes: d021c344051a ("VSOCK: Introduce VM Sockets") here, too, to make
backporting easier?

[2] commit 455f05ecd2b2 ("vsock: split dwork to avoid reinitializations")

Thanks,
Peilin Ye

 net/vmw_vsock/af_vsock.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

Stefano Garzarella Aug. 8, 2022, 7:55 a.m. UTC | #1
On Sun, Aug 07, 2022 at 02:00:11AM -0700, Peilin Ye wrote:
>From: Peilin Ye <peilin.ye@bytedance.com>
>
>An O_NONBLOCK vsock_connect() request may try to reschedule
>@connect_work.  Imagine the following sequence of vsock_connect()
>requests:
>
>  1. The 1st, non-blocking request schedules @connect_work, which will
>     expire after 200 jiffies.  Socket state is now SS_CONNECTING;
>
>  2. Later, the 2nd, blocking request gets interrupted by a signal after
>     a few jiffies while waiting for the connection to be established.
>     Socket state is back to SS_UNCONNECTED, but @connect_work is still
>     pending, and will expire after 100 jiffies.
>
>  3. Now, the 3rd, non-blocking request tries to schedule @connect_work
>     again.  Since @connect_work is already scheduled,
>     schedule_delayed_work() silently returns.  sock_hold() is called
>     twice, but sock_put() will only be called once in
>     vsock_connect_timeout(), causing a memory leak reported by syzbot:
>
>  BUG: memory leak
>  unreferenced object 0xffff88810ea56a40 (size 1232):
>    comm "syz-executor756", pid 3604, jiffies 4294947681 (age 12.350s)
>    hex dump (first 32 bytes):
>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>      28 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00  (..@............
>    backtrace:
>      [<ffffffff837c830e>] sk_prot_alloc+0x3e/0x1b0 net/core/sock.c:1930
>      [<ffffffff837cbe22>] sk_alloc+0x32/0x2e0 net/core/sock.c:1989
>      [<ffffffff842ccf68>] __vsock_create.constprop.0+0x38/0x320 net/vmw_vsock/af_vsock.c:734
>      [<ffffffff842ce8f1>] vsock_create+0xc1/0x2d0 net/vmw_vsock/af_vsock.c:2203
>      [<ffffffff837c0cbb>] __sock_create+0x1ab/0x2b0 net/socket.c:1468
>      [<ffffffff837c3acf>] sock_create net/socket.c:1519 [inline]
>      [<ffffffff837c3acf>] __sys_socket+0x6f/0x140 net/socket.c:1561
>      [<ffffffff837c3bba>] __do_sys_socket net/socket.c:1570 [inline]
>      [<ffffffff837c3bba>] __se_sys_socket net/socket.c:1568 [inline]
>      [<ffffffff837c3bba>] __x64_sys_socket+0x1a/0x20 net/socket.c:1568
>      [<ffffffff84512815>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>      [<ffffffff84512815>] do_syscall_64+0x35/0x80 arch/x86/entry/common.c:80
>      [<ffffffff84600068>] entry_SYSCALL_64_after_hwframe+0x44/0xae
>  <...>
>
>Use mod_delayed_work() instead: if @connect_work is already scheduled,
>reschedule it, and undo sock_hold() to keep the reference count
>balanced.
>
>Reported-and-tested-by: syzbot+b03f55bf128f9a38f064@syzkaller.appspotmail.com
>Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
>Co-developed-by: Stefano Garzarella <sgarzare@redhat.com>
>Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
>Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
>---
>change since v1:
>  - merged with Stefano's patch [1]
>
>[1] https://gitlab.com/sgarzarella/linux/-/commit/2d0f0b9cbbb30d58fdcbca7c1a857fd8f3110d61
>
>Hi Stefano,
>
>About the Fixes: tag, [2] introduced @connect_work, but all it did was
>break @dwork into two and move some INIT_DELAYED_WORK() calls, so I don't
>think [2] introduced this memory leak?
>
>Since [2] has already been backported to 4.9 and 4.14, I think we can
>use Fixes: d021c344051a ("VSOCK: Introduce VM Sockets") here, too, to make
>backporting easier?

Yep, I think it should be fine!

>
>[2] commit 455f05ecd2b2 ("vsock: split dwork to avoid reinitializations")
>
>Thanks,
>Peilin Ye
>
> net/vmw_vsock/af_vsock.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index f04abf662ec6..fe14f6cbca22 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1391,7 +1391,13 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
> 			 * timeout fires.
> 			 */
> 			sock_hold(sk);
>-			schedule_delayed_work(&vsk->connect_work, timeout);
>+
>+			/* If the timeout function is already scheduled,
>+			 * reschedule it, then ungrab the socket refcount to
>+			 * keep it balanced.
>+			 */
>+			if (mod_delayed_work(system_wq, &vsk->connect_work, timeout))
                             ^
Checkpatch warns here about line length.
If you have to re-send, please split it.

Anyway, the patch LGTM:

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>

Thanks,
Stefano

Peilin Ye Aug. 8, 2022, 5:45 p.m. UTC | #2
On Mon, Aug 08, 2022 at 09:55:33AM +0200, Stefano Garzarella wrote:
> On Sun, Aug 07, 2022 at 02:00:11AM -0700, Peilin Ye wrote:
> > net/vmw_vsock/af_vsock.c | 8 +++++++-
> > 1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> > index f04abf662ec6..fe14f6cbca22 100644
> > --- a/net/vmw_vsock/af_vsock.c
> > +++ b/net/vmw_vsock/af_vsock.c
> > @@ -1391,7 +1391,13 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
> > 			 * timeout fires.
> > 			 */
> > 			sock_hold(sk);
> > -			schedule_delayed_work(&vsk->connect_work, timeout);
> > +
> > +			/* If the timeout function is already scheduled,
> > +			 * reschedule it, then ungrab the socket refcount to
> > +			 * keep it balanced.
> > +			 */
> > +			if (mod_delayed_work(system_wq, &vsk->connect_work, timeout))
>                             ^
> Checkpatch warns here about line length.
> If you have to re-send, please split it.

Oh, net-next HEAD's checkpatch --strict didn't complain; I didn't know
Patchwork checks for 80 columns.  I will send v3 soon.

> Anyway, the patch LGTM:
> 
> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>

Thanks!

Peilin Ye

Patch

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index f04abf662ec6..fe14f6cbca22 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1391,7 +1391,13 @@  static int vsock_connect(struct socket *sock, struct sockaddr *addr,
 			 * timeout fires.
 			 */
 			sock_hold(sk);
-			schedule_delayed_work(&vsk->connect_work, timeout);
+
+			/* If the timeout function is already scheduled,
+			 * reschedule it, then ungrab the socket refcount to
+			 * keep it balanced.
+			 */
+			if (mod_delayed_work(system_wq, &vsk->connect_work, timeout))
+				sock_put(sk);
 
 			/* Skip ahead to preserve error code set above. */
 			goto out_wait;
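
The reference-count imbalance described in the commit message can be reproduced in a small user-space model. The sketch below is not kernel code: struct model_sock and every model_*() helper are hypothetical stand-ins introduced here for illustration. It assumes only the documented return conventions of the real functions: schedule_delayed_work() returns false and does nothing when the work is already pending, while mod_delayed_work() re-arms the timer and returns true when the work was already pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; not the kernel's struct sock. */
struct model_sock {
	int refcnt;        /* models the socket reference count */
	bool work_pending; /* models @connect_work being queued */
};

static void model_sock_hold(struct model_sock *sk)
{
	sk->refcnt++;
}

static void model_sock_put(struct model_sock *sk)
{
	assert(sk->refcnt > 0);
	sk->refcnt--;
}

/* Models schedule_delayed_work(): no-op returning false if already pending. */
static bool model_schedule(struct model_sock *sk)
{
	if (sk->work_pending)
		return false;
	sk->work_pending = true;
	return true;
}

/* Models mod_delayed_work(): always (re)arms; true if it was pending. */
static bool model_mod(struct model_sock *sk)
{
	bool was_pending = sk->work_pending;

	sk->work_pending = true;
	return was_pending;
}

/* Models vsock_connect_timeout(): runs once and drops one reference. */
static void model_timeout_fires(struct model_sock *sk)
{
	if (!sk->work_pending)
		return;
	sk->work_pending = false;
	model_sock_put(sk);
}

/* Non-blocking connect path before the patch: return value ignored. */
static void model_connect_old(struct model_sock *sk)
{
	model_sock_hold(sk);
	model_schedule(sk);
}

/* Non-blocking connect path after the patch: undo the extra hold. */
static void model_connect_new(struct model_sock *sk)
{
	model_sock_hold(sk);
	if (model_mod(sk))
		model_sock_put(sk);
}

/*
 * Replays requests 1 and 3 from the commit message (request 2 takes no
 * reference and leaves the work pending), lets the timeout fire once,
 * and returns how many references are left over.
 */
static int model_leaked_refs(void (*connect_fn)(struct model_sock *))
{
	struct model_sock sk = { .refcnt = 1, .work_pending = false };

	connect_fn(&sk);          /* 1st, non-blocking request */
	connect_fn(&sk);          /* 3rd, non-blocking request */
	model_timeout_fires(&sk); /* the single timeout callback */
	return sk.refcnt - 1;
}
```

Calling model_leaked_refs(model_connect_old) leaves one reference outstanding, which corresponds to the leaked socket in the syzbot report, while model_leaked_refs(model_connect_new) leaves none.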