Message ID | 5e26963328ae04aa375089a713ec40f5eb6adacd.1662051551.git.pabeni@redhat.com (mailing list archive) |
---|---|
State | Superseded, archived |
Delegated to: | Matthieu Baerts |
Headers | show |
Series | [v3,mptcp-next,1/3] mptcp: propagate fastclose error. | expand |
Hi Paolo, On 01/09/2022 19:31, Paolo Abeni wrote: > When an mptcp socket is closed due to an incoming FASTCLOSE > option, so specific sk_err is set and later syscall will > fail usually with EPIPE. > > Align the current fastclose error handling with TCP reset, > properly setting the socket error according to the current > msk state and propagating such error. Thank you for the patch and sorry for the delay! I have some questions here below: > Additionally sendmsg() is currently not handling properly > the sk_err, always returning EPIPE. Could it be seen as a bug-fix and deserving a dedicated commit? > Signed-off-by: Paolo Abeni <pabeni@redhat.com> > --- > net/mptcp/protocol.c | 45 ++++++++++++++++++++++++++++++++++---------- > 1 file changed, 35 insertions(+), 10 deletions(-) > > diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c > index f4891db86217..b04f184695e4 100644 > --- a/net/mptcp/protocol.c > +++ b/net/mptcp/protocol.c > @@ -1685,9 +1685,13 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) > if ((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) { > ret = sk_stream_wait_connect(sk, &timeo); > if (ret) > - goto out; > + goto do_error; > } > > + ret = -EPIPE; Regarding this line above... (see below) > + if (unlikely(sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))) > + goto do_error; > + > pfrag = sk_page_frag(sk); > > while (msg_data_left(msg)) { ...Could we eventually have no data left and no going through the while loop? In this case "copied" will be 0 and we will return the new -EPIPE, maybe not what we want, no? Or should we change 'ret' value just before calling 'goto do_error' to avoid that? (...) > @@ -2404,12 +2410,31 @@ static void mptcp_check_fastclose(struct mptcp_sock *msk) > unlock_sock_fast(tcp_sk, slow); > } > > + /* Mirror the tcp_reset() error propagation */ > + switch (sk->sk_state) { > + case TCP_SYN_SENT: > + sk->sk_err = ECONNREFUSED; > + break; > + case TCP_CLOSE_WAIT: > + sk->sk_err = EPIPE; > + break; > + case TCP_CLOSE: > + return; > + default: > + sk->sk_err = ECONNRESET; > + } Should we eventually move this out of tcp_reset(), export it (inline function?) and re-using it here to avoid the duplication (and eventually being out-of-sync in case of changes on TCP side? even if I guess this would not change but well) > + > inet_sk_state_store(sk, TCP_CLOSE); > sk->sk_shutdown = SHUTDOWN_MASK; > smp_mb__before_atomic(); /* SHUTDOWN must be visible first */ > set_bit(MPTCP_WORK_CLOSE_SUBFLOW, &msk->flags); > > - mptcp_close_wake_up(sk); > + /* the calling mptcp_worker will properly destroy the socket */ > + if (sock_flag(sk, SOCK_DEAD)) > + return; > + > + sk->sk_state_change(sk); > + sk_error_report(sk); > } > > static void __mptcp_retrans(struct sock *sk) Cheers, Matt
On Wed, 2022-09-07 at 19:22 +0200, Matthieu Baerts wrote: > Hi Paolo, > > On 01/09/2022 19:31, Paolo Abeni wrote: > > When an mptcp socket is closed due to an incoming FASTCLOSE > > option, so specific sk_err is set and later syscall will > > fail usually with EPIPE. > > > > Align the current fastclose error handling with TCP reset, > > properly setting the socket error according to the current > > msk state and propagating such error. > > Thank you for the patch and sorry for the delay! Thank you for the review! > > I have some questions here below: > > > Additionally sendmsg() is currently not handling properly > > the sk_err, always returning EPIPE. > Could it be seen as a bug-fix and deserving a dedicated commit? I personally see this as an improvement, as AFAIK there is no mandated by RFC return error code for MPTCP. > > > Signed-off-by: Paolo Abeni <pabeni@redhat.com> > > --- > > net/mptcp/protocol.c | 45 ++++++++++++++++++++++++++++++++++---------- > > 1 file changed, 35 insertions(+), 10 deletions(-) > > > > diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c > > index f4891db86217..b04f184695e4 100644 > > --- a/net/mptcp/protocol.c > > +++ b/net/mptcp/protocol.c > > @@ -1685,9 +1685,13 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) > > if ((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) { > > ret = sk_stream_wait_connect(sk, &timeo); > > if (ret) > > - goto out; > > + goto do_error; > > } > > > > + ret = -EPIPE; > > Regarding this line above... (see below) > > > + if (unlikely(sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))) > > + goto do_error; > > + > > pfrag = sk_page_frag(sk); > > > > while (msg_data_left(msg)) { > > ...Could we eventually have no data left and no going through the while > loop? yes... > In this case "copied" will be 0 and we will return the new -EPIPE, maybe > not what we want, no? no;) we now always return 'copied'. We set 'copied' to the correct error code only in the error path, saving a conditional in the fast path. All gracefully stolen from plain TCP ;) > Or should we change 'ret' value just before calling 'goto do_error' to > avoid that? > > (...) > > > @@ -2404,12 +2410,31 @@ static void mptcp_check_fastclose(struct mptcp_sock *msk) > > unlock_sock_fast(tcp_sk, slow); > > } > > > > + /* Mirror the tcp_reset() error propagation */ > > + switch (sk->sk_state) { > > + case TCP_SYN_SENT: > > + sk->sk_err = ECONNREFUSED; > > + break; > > + case TCP_CLOSE_WAIT: > > + sk->sk_err = EPIPE; > > + break; > > + case TCP_CLOSE: > > + return; > > + default: > > + sk->sk_err = ECONNRESET; > > + } > > Should we eventually move this out of tcp_reset(), export it (inline > function?) and re-using it here to avoid the duplication (and eventually > being out-of-sync in case of changes on TCP side? even if I guess this > would not change but well) That could be a possible follow-up I guess, but I would refrain from touching TCP to address issues/295. Thanks! Paolo
Hi Paolo, Thank you for your replies! On 08/09/2022 13:48, Paolo Abeni wrote: > On Wed, 2022-09-07 at 19:22 +0200, Matthieu Baerts wrote: >> On 01/09/2022 19:31, Paolo Abeni wrote: (...) >>> Additionally sendmsg() is currently not handling properly >>> the sk_err, always returning EPIPE. >> Could it be seen as a bug-fix and deserving a dedicated commit? > > I personally see this as an improvement, as AFAIK there is no mandated > by RFC return error code for MPTCP. Mmh yes I see, we can say that. >> >>> Signed-off-by: Paolo Abeni <pabeni@redhat.com> >>> --- >>> net/mptcp/protocol.c | 45 ++++++++++++++++++++++++++++++++++---------- >>> 1 file changed, 35 insertions(+), 10 deletions(-) >>> >>> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c >>> index f4891db86217..b04f184695e4 100644 >>> --- a/net/mptcp/protocol.c >>> +++ b/net/mptcp/protocol.c >>> @@ -1685,9 +1685,13 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) >>> if ((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) { >>> ret = sk_stream_wait_connect(sk, &timeo); >>> if (ret) >>> - goto out; >>> + goto do_error; >>> } >>> >>> + ret = -EPIPE; >> >> Regarding this line above... (see below) >> >>> + if (unlikely(sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))) >>> + goto do_error; >>> + >>> pfrag = sk_page_frag(sk); >>> >>> while (msg_data_left(msg)) { >> >> ...Could we eventually have no data left and no going through the while >> loop? > > yes... > >> In this case "copied" will be 0 and we will return the new -EPIPE, maybe >> not what we want, no? > > no;) we now always return 'copied'. We set 'copied' to the correct > error code only in the error path, saving a conditional in the fast > path. My bad, for the end of the function, I was looking at code prior this patch... Indeed, all good with the rest of the modifications :) > All gracefully stolen from plain TCP ;) :) > >> Or should we change 'ret' value just before calling 'goto do_error' to >> avoid that? >> >> (...) >> >>> @@ -2404,12 +2410,31 @@ static void mptcp_check_fastclose(struct mptcp_sock *msk) >>> unlock_sock_fast(tcp_sk, slow); >>> } >>> >>> + /* Mirror the tcp_reset() error propagation */ >>> + switch (sk->sk_state) { >>> + case TCP_SYN_SENT: >>> + sk->sk_err = ECONNREFUSED; >>> + break; >>> + case TCP_CLOSE_WAIT: >>> + sk->sk_err = EPIPE; >>> + break; >>> + case TCP_CLOSE: >>> + return; >>> + default: >>> + sk->sk_err = ECONNRESET; >>> + } >> >> Should we eventually move this out of tcp_reset(), export it (inline >> function?) and re-using it here to avoid the duplication (and eventually >> being out-of-sync in case of changes on TCP side? even if I guess this >> would not change but well) > > That could be a possible follow-up I guess, but I would refrain from > touching TCP to address issues/295. Sure, we can do that later as a clean-up. Patch 1/3 and 2/3 from this v3 are then good to me: Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> (I'm not going to apply them now as we need patch 3/3 and I had some questions there + packetdrill adaptations) Cheers, Matt
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index f4891db86217..b04f184695e4 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -1685,9 +1685,13 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) if ((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) { ret = sk_stream_wait_connect(sk, &timeo); if (ret) - goto out; + goto do_error; } + ret = -EPIPE; + if (unlikely(sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))) + goto do_error; + pfrag = sk_page_frag(sk); while (msg_data_left(msg)) { @@ -1696,11 +1700,6 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) bool dfrag_collapsed; size_t psize, offset; - if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN)) { - ret = -EPIPE; - goto out; - } - /* reuse tail pfrag, if possible, or carve a new one from the * page allocator */ @@ -1732,7 +1731,7 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) if (copy_page_from_iter(dfrag->page, offset, psize, &msg->msg_iter) != psize) { ret = -EFAULT; - goto out; + goto do_error; } /* data successfully copied into the write queue */ @@ -1764,7 +1763,7 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) __mptcp_push_pending(sk, msg->msg_flags); ret = sk_stream_wait_memory(sk, &timeo); if (ret) - goto out; + goto do_error; } if (copied) @@ -1772,7 +1771,14 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) out: release_sock(sk); - return copied ? : ret; + return copied; + +do_error: + if (copied) + goto out; + + copied = sk_stream_error(sk, msg->msg_flags, ret); + goto out; } static int __mptcp_recvmsg_mskq(struct mptcp_sock *msk, @@ -2404,12 +2410,31 @@ static void mptcp_check_fastclose(struct mptcp_sock *msk) unlock_sock_fast(tcp_sk, slow); } + /* Mirror the tcp_reset() error propagation */ + switch (sk->sk_state) { + case TCP_SYN_SENT: + sk->sk_err = ECONNREFUSED; + break; + case TCP_CLOSE_WAIT: + sk->sk_err = EPIPE; + break; + case TCP_CLOSE: + return; + default: + sk->sk_err = ECONNRESET; + } + inet_sk_state_store(sk, TCP_CLOSE); sk->sk_shutdown = SHUTDOWN_MASK; smp_mb__before_atomic(); /* SHUTDOWN must be visible first */ set_bit(MPTCP_WORK_CLOSE_SUBFLOW, &msk->flags); - mptcp_close_wake_up(sk); + /* the calling mptcp_worker will properly destroy the socket */ + if (sock_flag(sk, SOCK_DEAD)) + return; + + sk->sk_state_change(sk); + sk_error_report(sk); } static void __mptcp_retrans(struct sock *sk)
When an mptcp socket is closed due to an incoming FASTCLOSE option, so specific sk_err is set and later syscall will fail usually with EPIPE. Align the current fastclose error handling with TCP reset, properly setting the socket error according to the current msk state and propagating such error. Additionally sendmsg() is currently not handling properly the sk_err, always returning EPIPE. Signed-off-by: Paolo Abeni <pabeni@redhat.com> --- net/mptcp/protocol.c | 45 ++++++++++++++++++++++++++++++++++---------- 1 file changed, 35 insertions(+), 10 deletions(-)