Message ID | 20230407171654.107311-5-john.fastabend@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | BPF |
Series | bpf sockmap fixes |
On Fri, Apr 07, 2023 at 10:16 AM -07, John Fastabend wrote:
> The sockmap code is returning EAGAIN after a FIN packet is received and no
> more data is on the receive queue. Correct behavior is to return 0 to the
> user and the user can then close the socket. The EAGAIN causes many apps
> to retry which masks the problem. Eventually the socket is evicted from
> the sockmap because its released from sockmap sock free handling. The
> issue creates a delay and can cause some errors on application side.
>
> To fix this check on sk_msg_recvmsg side if length is zero and FIN flag
> is set then set return to zero. A selftest will be added to check this
> condition.
>
> Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()")
> Tested-by: William Findlay <will@isovalent.com>
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---

Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index ebf917511937..804bd0c247d0 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -174,6 +174,24 @@ static int tcp_msg_wait_data(struct sock *sk, struct sk_psock *psock,
 	return ret;
 }
 
+static bool is_next_msg_fin(struct sk_psock *psock)
+{
+	struct scatterlist *sge;
+	struct sk_msg *msg_rx;
+	int i;
+
+	msg_rx = sk_psock_peek_msg(psock);
+	i = msg_rx->sg.start;
+	sge = sk_msg_elem(msg_rx, i);
+	if (!sge->length) {
+		struct sk_buff *skb = msg_rx->skb;
+
+		if (skb && TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
+			return true;
+	}
+	return false;
+}
+
 static int tcp_bpf_recvmsg_parser(struct sock *sk,
 				  struct msghdr *msg,
 				  size_t len,
@@ -196,6 +214,19 @@ static int tcp_bpf_recvmsg_parser(struct sock *sk,
 	lock_sock(sk);
 msg_bytes_ready:
 	copied = sk_msg_recvmsg(sk, psock, msg, len, flags);
+	/* The typical case for EFAULT is the socket was gracefully
+	 * shutdown with a FIN pkt. So check here the other case is
+	 * some error on copy_page_to_iter which would be unexpected.
+	 * On fin return correct return code to zero.
+	 */
+	if (copied == -EFAULT) {
+		bool is_fin = is_next_msg_fin(psock);
+
+		if (is_fin) {
+			copied = 0;
+			goto out;
+		}
+	}
 	if (!copied) {
 		long timeo;
 		int data;
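
For illustration only, and not part of the patch: the user-space contract
the commit message relies on (recv() returning 0 once the peer has sent
FIN and the receive queue is empty) can be sketched as below. This is a
minimal sketch under stated assumptions. The helper names
drain_until_eof() and demo_read_after_shutdown() are hypothetical, and an
AF_UNIX socketpair with shutdown(SHUT_WR) stands in for a TCP connection
receiving a FIN; no sockmap or BPF program is involved.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read until the peer's end-of-stream is observed.
 * Returns the total number of bytes read once recv() returns 0, or -1
 * on a hard error.
 */
static long drain_until_eof(int fd)
{
	char buf[256];
	long total = 0;

	assert(fd >= 0);

	for (;;) {
		ssize_t n = recv(fd, buf, sizeof(buf), 0);

		if (n > 0) {
			total += n;
			continue;
		}
		if (n == 0)	/* EOF: peer shut down, nothing left queued */
			return total;
		if (errno == EINTR)
			continue;
		if (errno == EAGAIN || errno == EWOULDBLOCK) {
			/* Non-blocking socket with no data yet: wait, then retry.
			 * If EOF were misreported as EAGAIN, this is the path an
			 * application would keep taking.
			 */
			struct pollfd pfd = { .fd = fd, .events = POLLIN };

			if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
				return -1;
			continue;
		}
		return -1;
	}
}

/* Hypothetical demo: send 5 bytes, half-close, then drain the other end. */
static long demo_read_after_shutdown(void)
{
	int sv[2];
	long total;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	if (send(sv[0], "hello", 5, 0) != 5) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	shutdown(sv[0], SHUT_WR);	/* stand-in for the peer sending FIN */
	total = drain_until_eof(sv[1]);
	close(sv[0]);
	close(sv[1]);
	return total;
}
```

A caller that sees the 0 return can close the socket right away, which is
the behavior the commit message describes as correct; a loop like the one
above has no way to tell a misreported EAGAIN from "no data yet".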