diff mbox series

[bpf] skmsg: Fix invalid last sg check in sk_msg_recvmsg()

Message ID 20220628123616.186950-1-liujian56@huawei.com (mailing list archive)
State Accepted
Commit 9974d37ea75f01b47d16072b5dad305bd8d23fcc
Delegated to: BPF
Series [bpf] skmsg: Fix invalid last sg check in sk_msg_recvmsg()

Checks

Context                         Check    Description
netdev/tree_selection           success  Clearly marked for bpf
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/subject_prefix           success  Link
netdev/cover_letter             success  Single patches do not need cover letters
netdev/patch_count              success  Link
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 2 this patch: 2
netdev/cc_maintainers           success  CCed 15 of 15 maintainers
netdev/build_clang              success  Errors and warnings before: 6 this patch: 6
netdev/module_param             success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 2 this patch: 2
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 16 lines checked
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0
bpf/vmtest-bpf-PR               success  PR summary
bpf/vmtest-bpf-VM_Test-3        success  Logs for Kernel LATEST on z15 with gcc
bpf/vmtest-bpf-VM_Test-1        success  Logs for Kernel LATEST on ubuntu-latest with gcc
bpf/vmtest-bpf-VM_Test-2        success  Logs for Kernel LATEST on ubuntu-latest with llvm-15

Commit Message

liujian (CE) June 28, 2022, 12:36 p.m. UTC
In sk_psock_skb_ingress_enqueue(), if the linear area, nr_frags and
frag_list of the skb add up to NR_MSG_FRAG_IDS blocks in total,
skb_to_sgvec() returns NR_MSG_FRAG_IDS. msg->sg.end is then set to
NR_MSG_FRAG_IDS, and entry (NR_MSG_FRAG_IDS - 1) is marked as the last SG
of the msg. When sk_msg_recvmsg() receives this msg and i reaches
(NR_MSG_FRAG_IDS - 1), sk_msg_iter_var_next(i) wraps i to 0 rather than
advancing it to NR_MSG_FRAG_IDS, so the conditions
"msg_rx->sg.start == msg_rx->sg.end" and "i != msg_rx->sg.end" no longer
detect the end of the msg.

As a result, the processed msg is never removed from the ingress_msg list,
even though the length of every sge in it is now 0. Each subsequent
recvmsg() syscall processes the same msg again and, because the sge lengths
are 0, always returns -EFAULT.

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Liu Jian <liujian56@huawei.com>
---
 net/core/skmsg.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
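
The wrap-around described in the commit message can be reproduced in a small user-space model. This is only a sketch, not kernel code: MODEL_NR_FRAG_IDS, model_iter_next() and model_index_reaches_end() are hypothetical names introduced for illustration, and the ring size of 4 is an arbitrary stand-in for the kernel's NR_MSG_FRAG_IDS. It shows that an index advanced with the wrap-around rule can never equal an end value of MODEL_NR_FRAG_IDS, so an "i != end" test cannot become false once the ring is completely full.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative ring size (assumption); the kernel's NR_MSG_FRAG_IDS differs. */
#define MODEL_NR_FRAG_IDS 4

/* Same wrap-around rule the commit message attributes to
 * sk_msg_iter_var_next(): the index wraps to 0 and therefore never
 * takes the value MODEL_NR_FRAG_IDS itself. */
static unsigned int model_iter_next(unsigned int i)
{
	i++;
	if (i == MODEL_NR_FRAG_IDS)
		i = 0;
	return i;
}

/* Advance the index nr_used times from 0 and report whether it ever
 * equals 'end', i.e. whether an "i != end" loop condition could ever
 * become false while walking the used entries. */
static bool model_index_reaches_end(unsigned int nr_used, unsigned int end)
{
	unsigned int i = 0, step;

	assert(nr_used <= MODEL_NR_FRAG_IDS);
	for (step = 0; step < nr_used; step++) {
		i = model_iter_next(i);
		if (i == end)
			return true;
	}
	return false;
}
```

With three used entries and end == 3 the index reaches end as expected; with all MODEL_NR_FRAG_IDS entries used and end == MODEL_NR_FRAG_IDS it never does, which is the case the patch addresses.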

Comments

John Fastabend June 30, 2022, 7:29 a.m. UTC | #1
Looks correct to me, but I'll test it tomorrow and add a reviewed-by and
tested-by then. Thanks!
John Fastabend July 1, 2022, 8:30 p.m. UTC | #2
Still testing but adding ack.

Acked-by: John Fastabend <john.fastabend@gmail.com>
patchwork-bot+netdevbpf@kernel.org July 11, 2022, 4:30 p.m. UTC | #3
Hello:

This patch was applied to bpf/bpf-next.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:

On Tue, 28 Jun 2022 20:36:16 +0800 you wrote:
> In sk_psock_skb_ingress_enqueue function, if the linear area + nr_frags +
> frag_list of the SKB has NR_MSG_FRAG_IDS blocks in total, skb_to_sgvec
> will return NR_MSG_FRAG_IDS, then msg->sg.end will be set to
> NR_MSG_FRAG_IDS, and in addition, (NR_MSG_FRAG_IDS - 1) is set to the last
> SG of msg. Recv the msg in sk_msg_recvmsg, when i is (NR_MSG_FRAG_IDS - 1),
> the sk_msg_iter_var_next(i) will change i to 0 (not NR_MSG_FRAG_IDS), the
> judgment condition "msg_rx->sg.start==msg_rx->sg.end" and
> "i != msg_rx->sg.end" can not work.
> 
> [...]

Here is the summary with links:
  - [bpf] skmsg: Fix invalid last sg check in sk_msg_recvmsg()
    https://git.kernel.org/bpf/bpf-next/c/9974d37ea75f

You are awesome, thank you!

Patch

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index b0fcd0200e84..a8dbea559c7f 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -462,7 +462,7 @@  int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg,
 
 			if (copied == len)
 				break;
-		} while (i != msg_rx->sg.end);
+		} while (!sg_is_last(sge));
 
 		if (unlikely(peek)) {
 			msg_rx = sk_psock_next_msg(psock, msg_rx);
@@ -472,7 +472,7 @@  int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg,
 		}
 
 		msg_rx->sg.start = i;
-		if (!sge->length && msg_rx->sg.start == msg_rx->sg.end) {
+		if (!sge->length && sg_is_last(sge)) {
 			msg_rx = sk_psock_dequeue_msg(psock);
 			kfree_sk_msg(msg_rx);
 		}
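
The idea behind the patched conditions, terminating on an end marker carried by the entry itself instead of comparing indices, can be sketched in user space as follows. This is a simplified model under stated assumptions, not the kernel implementation: struct model_sge, its "last" flag (standing in for the marker that sg_is_last() tests), MODEL_NR_FRAG_IDS and model_drain() are all hypothetical, and every entry is drained in full on each visit, which the real sk_msg_recvmsg() loop does not guarantee.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative ring size (assumption); the kernel's NR_MSG_FRAG_IDS differs. */
#define MODEL_NR_FRAG_IDS 4

struct model_sge {
	unsigned int length;
	bool last;	/* stands in for the end marker checked by sg_is_last() */
};

/* Drain entries from index 0 up to and including the one flagged as
 * last, mirroring the shape of "do { ... } while (!sg_is_last(sge))".
 * Exactly one entry in the ring must carry the flag.
 * Returns true when the final entry visited is both drained and marked
 * last, the condition under which the patched code dequeues the msg. */
static bool model_drain(struct model_sge *ring, unsigned int *visited)
{
	struct model_sge *sge;
	unsigned int i = 0;

	*visited = 0;
	do {
		sge = &ring[i];
		sge->length = 0;	/* pretend the whole entry was copied out */
		(*visited)++;

		/* Index wraps as in the kernel iterator; it is no longer
		 * used to decide when to stop. */
		i++;
		if (i == MODEL_NR_FRAG_IDS)
			i = 0;

		/* Model invariant: a marked entry exists in the ring. */
		assert(*visited <= MODEL_NR_FRAG_IDS);
	} while (!sge->last);

	return !sge->length && sge->last;
}
```

Because termination depends only on the marker, the loop stops correctly even when the marked entry sits at index MODEL_NR_FRAG_IDS - 1 and the index has already wrapped to 0.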