Message ID | 20220804020925.32167-1-yepeilin.cs@gmail.com (mailing list archive) |
---|---|
State | RFC |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [RFC,net-next] vsock: Reschedule connect_work for O_NONBLOCK connect() requests | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Clearly marked for net-next |
netdev/apply | fail | Patch does not apply to net-next |
On Wed, Aug 03, 2022 at 07:09:25PM -0700, Peilin Ye wrote: >From: Peilin Ye <peilin.ye@bytedance.com> > >An O_NONBLOCK vsock_connect() request may try to reschedule >@connect_work. Consider the following vsock_connect() requests: > > 1. The 1st, non-blocking request schedules @connect_work, which will > expire after, say, 200 jiffies. Socket state is now SS_CONNECTING; > > 2. Later, the 2nd, blocking request gets interrupted by a signal after > 5 jiffies while waiting for the connection to be established. > Socket state is back to SS_UNCONNECTED, and @connect_work will > expire after 100 jiffies; > > 3. Now, the 3rd, non-blocking request tries to schedule @connect_work > again, but @connect_work has already been scheduled, and will > expire in, say, 50 jiffies. > >In this scenario, currently this 3rd request simply decreases the sock >reference count and returns. Instead, let it reschedules @connect_work >and resets the timeout back to @connect_timeout. > >Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> >--- >Hi all, > >This patch is RFC because it bases on Stefano's WIP fix [1] for a bug >[2] >reported by syzbot, and it won't apply on current net-next. I think it >solves a separate issue. Nice, this is better! Feel free to include my patch in this (inclunding also the Fixes tag and maybe senidng to syzbot and including its tag as well). The last thing I was trying to figure out before sending the patch was whether to set sock->state = SS_UNCONNECTED in vsock_connect_timeout(). I think we should do that, otherwise a subsequent to connect() with O_NONBLOCK set would keep returning -EALREADY, even though the timeout has expired. What do you think? I don't think it changes anything for the bug raised by sysbot, so it could be a separate patch. Thanks, Stefano > >Please advise, thanks! >Peilin Ye > >[1] https://gitlab.com/sgarzarella/linux/-/commit/2d0f0b9cbbb30d58fdcbca7c1a857fd8f3110d61 >[2] https://syzkaller.appspot.com/bug?id=cd9103dc63346d26acbbdbf5c6ba9bd74e48c860 > > net/vmw_vsock/af_vsock.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > >diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c >index 194d22291d8b..417e4ad17c03 100644 >--- a/net/vmw_vsock/af_vsock.c >+++ b/net/vmw_vsock/af_vsock.c >@@ -1395,7 +1395,7 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr, > /* If the timeout function is already scheduled, ungrab > * the socket refcount to not leave it unbalanced. > */ >- if (!schedule_delayed_work(&vsk->connect_work, timeout)) >+ if (mod_delayed_work(system_wq, &vsk->connect_work, timeout)) > sock_put(sk); > > /* Skip ahead to preserve error code set above. */ >-- >2.20.1 >
Hi Stefano, On Thu, Aug 04, 2022 at 08:59:23AM +0200, Stefano Garzarella wrote: > The last thing I was trying to figure out before sending the patch was > whether to set sock->state = SS_UNCONNECTED in vsock_connect_timeout(). > > I think we should do that, otherwise a subsequent to connect() with > O_NONBLOCK set would keep returning -EALREADY, even though the timeout has > expired. > > What do you think? Thanks for bringing this up, after thinking about sock->state, I have 3 thoughts: 1. I think the root cause of this memleak is, we keep @connect_work pending, even after the 2nd, blocking request times out (or gets interrupted) and sets sock->state back to SS_UNCONNECTED. @connect_work is effectively no-op when sk->sk_state is TCP_CLOS{E,ING} anyway, so why not we just cancel @connect_work when blocking requests time out or get interrupted? Something like: diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index f04abf662ec6..62628af84164 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -1402,6 +1402,9 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr, lock_sock(sk); if (signal_pending(current)) { + if (cancel_delayed_work(&vsk->connect_work)) + sock_put(sk); + err = sock_intr_errno(timeout); sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE; sock->state = SS_UNCONNECTED; @@ -1409,6 +1412,9 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr, vsock_remove_connected(vsk); goto out_wait; } else if (timeout == 0) { + if (cancel_delayed_work(&vsk->connect_work)) + sock_put(sk); + err = -ETIMEDOUT; sk->sk_state = TCP_CLOSE; sock->state = SS_UNCONNECTED; Then no need to worry about rescheduling @connect_work, and the state machine becomes more accurate. What do you think? I will ask syzbot to test this. 2. About your suggestion of setting sock->state = SS_UNCONNECTED in vsock_connect_timeout(), I think it makes sense. Are you going to send a net-next patch for this? 3. After a TCP_SYN_SENT sock receives VIRTIO_VSOCK_OP_RESPONSE in virtio_transport_recv_connecting(), why don't we cancel @connect_work? Am I missing something? Thanks, Peilin Ye
On Thu, Aug 04, 2022 at 04:44:47PM -0700, Peilin Ye wrote: >Hi Stefano, > >On Thu, Aug 04, 2022 at 08:59:23AM +0200, Stefano Garzarella wrote: >> The last thing I was trying to figure out before sending the patch was >> whether to set sock->state = SS_UNCONNECTED in vsock_connect_timeout(). >> >> I think we should do that, otherwise a subsequent to connect() with >> O_NONBLOCK set would keep returning -EALREADY, even though the timeout has >> expired. >> >> What do you think? > >Thanks for bringing this up, after thinking about sock->state, I have 3 >thoughts: > >1. I think the root cause of this memleak is, we keep @connect_work > pending, even after the 2nd, blocking request times out (or gets > interrupted) and sets sock->state back to SS_UNCONNECTED. > > @connect_work is effectively no-op when sk->sk_state is > TCP_CLOS{E,ING} anyway, so why not we just cancel @connect_work when > blocking requests time out or get interrupted? Something like: > >diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c >index f04abf662ec6..62628af84164 100644 >--- a/net/vmw_vsock/af_vsock.c >+++ b/net/vmw_vsock/af_vsock.c >@@ -1402,6 +1402,9 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr, > lock_sock(sk); > > if (signal_pending(current)) { >+ if (cancel_delayed_work(&vsk->connect_work)) >+ sock_put(sk); >+ > err = sock_intr_errno(timeout); > sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE; > sock->state = SS_UNCONNECTED; >@@ -1409,6 +1412,9 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr, > vsock_remove_connected(vsk); > goto out_wait; > } else if (timeout == 0) { >+ if (cancel_delayed_work(&vsk->connect_work)) >+ sock_put(sk); >+ > err = -ETIMEDOUT; > sk->sk_state = TCP_CLOSE; > sock->state = SS_UNCONNECTED; > > Then no need to worry about rescheduling @connect_work, and the state > machine becomes more accurate. What do you think? I will ask syzbot > to test this. It could work, but should we set `sk->sk_err` and call sk_error_report() to wake up thread waiting on poll()? Maybe the previous version is simpler. > >2. About your suggestion of setting sock->state = SS_UNCONNECTED in > vsock_connect_timeout(), I think it makes sense. Are you going to > send a net-next patch for this? If you have time, feel free to send it. Since it is a fix, I believe you can use the "net" tree. (Also for this patch). Remember to put the "Fixes" tag that should be the same. > >3. After a TCP_SYN_SENT sock receives VIRTIO_VSOCK_OP_RESPONSE in > virtio_transport_recv_connecting(), why don't we cancel > @connect_work? > Am I missing something? Because when the timeout will fire, vsock_connect_timeout() will just call sock_put() since sk->sk_state is changed. Of course, we can cancel it if we want, but I think it's not worth it. In the end, this rescheduling patch should solve all the problems. Thanks, Stefano
On Fri, Aug 05, 2022 at 02:42:39PM +0200, Stefano Garzarella wrote: > On Thu, Aug 04, 2022 at 04:44:47PM -0700, Peilin Ye wrote: > > 1. I think the root cause of this memleak is, we keep @connect_work > > pending, even after the 2nd, blocking request times out (or gets > > interrupted) and sets sock->state back to SS_UNCONNECTED. > > > > @connect_work is effectively no-op when sk->sk_state is > > TCP_CLOS{E,ING} anyway, so why not we just cancel @connect_work when > > blocking requests time out or get interrupted? Something like: > > > > diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c > > index f04abf662ec6..62628af84164 100644 > > --- a/net/vmw_vsock/af_vsock.c > > +++ b/net/vmw_vsock/af_vsock.c > > @@ -1402,6 +1402,9 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr, > > lock_sock(sk); > > > > if (signal_pending(current)) { > > + if (cancel_delayed_work(&vsk->connect_work)) > > + sock_put(sk); > > + > > err = sock_intr_errno(timeout); > > sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE; > > sock->state = SS_UNCONNECTED; > > @@ -1409,6 +1412,9 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr, > > vsock_remove_connected(vsk); > > goto out_wait; > > } else if (timeout == 0) { > > + if (cancel_delayed_work(&vsk->connect_work)) > > + sock_put(sk); > > + > > err = -ETIMEDOUT; > > sk->sk_state = TCP_CLOSE; > > sock->state = SS_UNCONNECTED; > > > > Then no need to worry about rescheduling @connect_work, and the state > > machine becomes more accurate. What do you think? I will ask syzbot > > to test this. > > It could work, but should we set `sk->sk_err` and call sk_error_report() to > wake up thread waiting on poll()? > > Maybe the previous version is simpler. Right, I forgot about sk_error_report(). Let us use the simpler version then. > > 2. About your suggestion of setting sock->state = SS_UNCONNECTED in > > vsock_connect_timeout(), I think it makes sense. Are you going to > > send a net-next patch for this? > > If you have time, feel free to send it. > > Since it is a fix, I believe you can use the "net" tree. (Also for this > patch). > > Remember to put the "Fixes" tag that should be the same. Sure, I will send them this week. Thanks! Peilin Ye
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 194d22291d8b..417e4ad17c03 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -1395,7 +1395,7 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr, /* If the timeout function is already scheduled, ungrab * the socket refcount to not leave it unbalanced. */ - if (!schedule_delayed_work(&vsk->connect_work, timeout)) + if (mod_delayed_work(system_wq, &vsk->connect_work, timeout)) sock_put(sk); /* Skip ahead to preserve error code set above. */