Message ID | 20230831013105.2930824-1-xukuohai@huaweicloud.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Headers | show |
Series | [bpf-next] selftests/bpf: fix a CI failure caused by vsock write | expand |
On 8/31/23 3:31 AM, Xu Kuohai wrote: > From: Xu Kuohai <xukuohai@huawei.com> > > While commit 90f0074cd9f9 ("selftests/bpf: fix a CI failure caused by vsock sockmap test") > fixes a receive failure of vsock sockmap test, there is still a write failure: > > Error: #211/79 sockmap_listen/sockmap VSOCK test_vsock_redir > Error: #211/79 sockmap_listen/sockmap VSOCK test_vsock_redir > ./test_progs:vsock_unix_redir_connectible:1501: egress: write: Transport endpoint is not connected > vsock_unix_redir_connectible:FAIL:1501 > ./test_progs:vsock_unix_redir_connectible:1501: ingress: write: Transport endpoint is not connected > vsock_unix_redir_connectible:FAIL:1501 > ./test_progs:vsock_unix_redir_connectible:1501: egress: write: Transport endpoint is not connected > vsock_unix_redir_connectible:FAIL:1501 > > The reason is that the vsock connection in the test is set to ESTABLISHED state > by function virtio_transport_recv_pkt, which is executed in a workqueue thread, > so when the user space test thread runs before the workqueue thread, this > problem occurs. > > To fix it, before writing the connection, wait for it to be connected. > > Fixes: d61bd8c1fd02 ("selftests/bpf: add a test case for vsock sockmap") > Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Thanks for the fix! Looks like this is gone now at least in the tests which succeed, but there are still two issues: 1) s390x fails in BPF CI as below: https://github.com/kernel-patches/bpf/actions/runs/6031993528/job/16366784236 Error: #211 sockmap_listen Error: #211/79 sockmap_listen/sockmap VSOCK test_vsock_redir Error: #211/79 sockmap_listen/sockmap VSOCK test_vsock_redir ./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument vsock_socketpair_connectible:FAIL:1456 ./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed vsock_unix_redir_connectible:FAIL:1494 ./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument vsock_socketpair_connectible:FAIL:1456 ./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed vsock_unix_redir_connectible:FAIL:1494 ./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument vsock_socketpair_connectible:FAIL:1456 ./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed vsock_unix_redir_connectible:FAIL:1494 ./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument vsock_socketpair_connectible:FAIL:1456 ./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed vsock_unix_redir_connectible:FAIL:1494 Error: #211/158 sockmap_listen/sockhash VSOCK test_vsock_redir Error: #211/158 sockmap_listen/sockhash VSOCK test_vsock_redir ./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument vsock_socketpair_connectible:FAIL:1456 ./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed vsock_unix_redir_connectible:FAIL:1494 ./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument vsock_socketpair_connectible:FAIL:1456 ./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed vsock_unix_redir_connectible:FAIL:1494 ./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument vsock_socketpair_connectible:FAIL:1456 ./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed vsock_unix_redir_connectible:FAIL:1494 ./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument vsock_socketpair_connectible:FAIL:1456 ./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed vsock_unix_redir_connectible:FAIL:1494 2) Various panics, some GPFs but also seen NULL pointer derefs, discussed in the other thread: https://lore.kernel.org/bpf/ZO+RQwJhPhYcNGAi@krava/ I believe issue 1) might still be related to your fix in here, ptal. Thanks, Daniel
On 8/31/2023 8:09 PM, Daniel Borkmann wrote: > On 8/31/23 3:31 AM, Xu Kuohai wrote: >> From: Xu Kuohai <xukuohai@huawei.com> >> >> While commit 90f0074cd9f9 ("selftests/bpf: fix a CI failure caused by vsock sockmap test") >> fixes a receive failure of vsock sockmap test, there is still a write failure: >> >> Error: #211/79 sockmap_listen/sockmap VSOCK test_vsock_redir >> Error: #211/79 sockmap_listen/sockmap VSOCK test_vsock_redir >> ./test_progs:vsock_unix_redir_connectible:1501: egress: write: Transport endpoint is not connected >> vsock_unix_redir_connectible:FAIL:1501 >> ./test_progs:vsock_unix_redir_connectible:1501: ingress: write: Transport endpoint is not connected >> vsock_unix_redir_connectible:FAIL:1501 >> ./test_progs:vsock_unix_redir_connectible:1501: egress: write: Transport endpoint is not connected >> vsock_unix_redir_connectible:FAIL:1501 >> >> The reason is that the vsock connection in the test is set to ESTABLISHED state >> by function virtio_transport_recv_pkt, which is executed in a workqueue thread, >> so when the user space test thread runs before the workqueue thread, this >> problem occurs. >> >> To fix it, before writing the connection, wait for it to be connected. >> >> Fixes: d61bd8c1fd02 ("selftests/bpf: add a test case for vsock sockmap") >> Signed-off-by: Xu Kuohai <xukuohai@huawei.com> > > Thanks for the fix! Looks like this is gone now at least in the tests which succeed, > but there are still two issues: > > 1) s390x fails in BPF CI as below: > > https://github.com/kernel-patches/bpf/actions/runs/6031993528/job/16366784236 > > Error: #211 sockmap_listen > Error: #211/79 sockmap_listen/sockmap VSOCK test_vsock_redir > Error: #211/79 sockmap_listen/sockmap VSOCK test_vsock_redir > ./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument > vsock_socketpair_connectible:FAIL:1456 > ./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed > vsock_unix_redir_connectible:FAIL:1494 > ./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument > vsock_socketpair_connectible:FAIL:1456 > ./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed > vsock_unix_redir_connectible:FAIL:1494 > ./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument > vsock_socketpair_connectible:FAIL:1456 > ./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed > vsock_unix_redir_connectible:FAIL:1494 > ./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument > vsock_socketpair_connectible:FAIL:1456 > ./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed > vsock_unix_redir_connectible:FAIL:1494 > Error: #211/158 sockmap_listen/sockhash VSOCK test_vsock_redir > Error: #211/158 sockmap_listen/sockhash VSOCK test_vsock_redir > ./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument > vsock_socketpair_connectible:FAIL:1456 > ./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed > vsock_unix_redir_connectible:FAIL:1494 > ./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument > vsock_socketpair_connectible:FAIL:1456 > ./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed > vsock_unix_redir_connectible:FAIL:1494 > ./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument > vsock_socketpair_connectible:FAIL:1456 > ./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed > vsock_unix_redir_connectible:FAIL:1494 > ./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument > vsock_socketpair_connectible:FAIL:1456 > ./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed > vsock_unix_redir_connectible:FAIL:1494 > Oops, I think it's because the esize variable is not initialized, causing getsockopt to read a garbage value. > 2) Various panics, some GPFs but also seen NULL pointer derefs, discussed in the other > thread: https://lore.kernel.org/bpf/ZO+RQwJhPhYcNGAi@krava/ > still debugging ... > I believe issue 1) might still be related to your fix in here, ptal. > Sorry for introducing issue 1), will post a fix soon. > Thanks, > Daniel > > .
On Thu, 2023-08-31 at 09:31 +0800, Xu Kuohai wrote: > From: Xu Kuohai <xukuohai@huawei.com> > > While commit 90f0074cd9f9 ("selftests/bpf: fix a CI failure caused by vsock sockmap test") > fixes a receive failure of vsock sockmap test, there is still a write failure: > > Error: #211/79 sockmap_listen/sockmap VSOCK test_vsock_redir > Error: #211/79 sockmap_listen/sockmap VSOCK test_vsock_redir > ./test_progs:vsock_unix_redir_connectible:1501: egress: write: Transport endpoint is not connected > vsock_unix_redir_connectible:FAIL:1501 > ./test_progs:vsock_unix_redir_connectible:1501: ingress: write: Transport endpoint is not connected > vsock_unix_redir_connectible:FAIL:1501 > ./test_progs:vsock_unix_redir_connectible:1501: egress: write: Transport endpoint is not connected > vsock_unix_redir_connectible:FAIL:1501 > > The reason is that the vsock connection in the test is set to ESTABLISHED state > by function virtio_transport_recv_pkt, which is executed in a workqueue thread, > so when the user space test thread runs before the workqueue thread, this > problem occurs. > > To fix it, before writing the connection, wait for it to be connected. Fun fact: while trying this patch I hit an oops [1]. I'm currently trying to bisect the commit causing this oops and have the following observation: - good revisions don't have this test as flip-flop - bad revisions have this test as flip-flop. [1] https://lore.kernel.org/bpf/de816b89073544deb2ce34c4b242d583a6d4660f.camel@gmail.com/ > > Fixes: d61bd8c1fd02 ("selftests/bpf: add a test case for vsock sockmap") > Signed-off-by: Xu Kuohai <xukuohai@huawei.com> > --- > .../bpf/prog_tests/sockmap_helpers.h | 29 +++++++++++++++++++ > .../selftests/bpf/prog_tests/sockmap_listen.c | 5 ++++ > 2 files changed, 34 insertions(+) > > diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_helpers.h b/tools/testing/selftests/bpf/prog_tests/sockmap_helpers.h > index d12665490a90..837dfb0361c6 100644 > --- a/tools/testing/selftests/bpf/prog_tests/sockmap_helpers.h > +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_helpers.h > @@ -179,6 +179,35 @@ > __ret; \ > }) > > +static inline int poll_connect(int fd, unsigned int timeout_sec) > +{ > + struct timeval timeout = { .tv_sec = timeout_sec }; > + fd_set wfds; > + int r; > + int eval; > + socklen_t esize; > + > + FD_ZERO(&wfds); > + FD_SET(fd, &wfds); > + > + r = select(fd + 1, NULL, &wfds, NULL, &timeout); > + if (r == 0) > + errno = ETIME; > + > + if (r != 1) > + return -1; > + > + if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &eval, &esize) < 0) > + return -1; > + > + if (eval != 0) { > + errno = eval; > + return -1; > + } > + > + return 0; > +} > + > static inline int poll_read(int fd, unsigned int timeout_sec) > { > struct timeval timeout = { .tv_sec = timeout_sec }; > diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c > index 5674a9d0cacf..2d3bf38677b6 100644 > --- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c > +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c > @@ -1452,6 +1452,11 @@ static int vsock_socketpair_connectible(int sotype, int *v0, int *v1) > if (p < 0) > goto close_cli; > > + if (poll_connect(c, IO_TIMEOUT_SEC) < 0) { > + FAIL_ERRNO("poll_connect"); > + goto close_cli; > + } > + > *v0 = p; > *v1 = c; >
diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_helpers.h b/tools/testing/selftests/bpf/prog_tests/sockmap_helpers.h index d12665490a90..837dfb0361c6 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockmap_helpers.h +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_helpers.h @@ -179,6 +179,35 @@ __ret; \ }) +static inline int poll_connect(int fd, unsigned int timeout_sec) +{ + struct timeval timeout = { .tv_sec = timeout_sec }; + fd_set wfds; + int r; + int eval; + socklen_t esize; + + FD_ZERO(&wfds); + FD_SET(fd, &wfds); + + r = select(fd + 1, NULL, &wfds, NULL, &timeout); + if (r == 0) + errno = ETIME; + + if (r != 1) + return -1; + + if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &eval, &esize) < 0) + return -1; + + if (eval != 0) { + errno = eval; + return -1; + } + + return 0; +} + static inline int poll_read(int fd, unsigned int timeout_sec) { struct timeval timeout = { .tv_sec = timeout_sec }; diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c index 5674a9d0cacf..2d3bf38677b6 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c @@ -1452,6 +1452,11 @@ static int vsock_socketpair_connectible(int sotype, int *v0, int *v1) if (p < 0) goto close_cli; + if (poll_connect(c, IO_TIMEOUT_SEC) < 0) { + FAIL_ERRNO("poll_connect"); + goto close_cli; + } + *v0 = p; *v1 = c;