[mptcp-next] Squash to "selftests: mptcp: add MP_FAIL reset testcase" - cleanups

Message ID cdcd6b9963a1c15b20ffcf71e06b6de70f67ec4a.1651664645.git.geliang.tang@suse.com (mailing list archive)
State Superseded, archived

Checks

Context                         Check    Description
matttbe/build                   fail     Build error with: make -C tools/testing/selftests/net/mptcp
matttbe/checkpatch              success  total: 0 errors, 0 warnings, 0 checks, 10 lines checked
matttbe/KVM_Validation__normal  fail     Script error! ❓
matttbe/KVM_Validation__debug   fail     Script error! ❓

Commit Message

Geliang Tang May 4, 2022, 11:44 a.m. UTC
Two small cleanups for the MP_FAIL reset test case:

Reduce the test file size from 1024 to 128 to make the test faster.
With Paolo's fix [2] for the MP_FAIL test case, the test becomes very slow.

Drop the '+'s passed to chk_csum_nr: with Paolo's fix [1] for act_pedit,
there are no extra checksum failures, so the '+'s are no longer needed.

Depends on Paolo's commits:
[1] net/sched: act_pedit: really ensure the skb is writable
[2] selftests: mptcp: fix MP_FAIL test-case

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 tools/testing/selftests/net/mptcp/mptcp_join.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
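
For readers unfamiliar with the '+' notation mentioned in the commit message, here is a minimal sketch of how such a prefix could be interpreted. It assumes '+N' means "at least N" and a bare N means "exactly N". The helper name count_matches is hypothetical and is not the actual chk_csum_nr code in mptcp_join.sh.

```shell
# Hypothetical helper (not the real chk_csum_nr).
# $1 = expected count, optionally prefixed with '+' (assumed: "at least")
# $2 = observed count
# Returns 0 when the observed count satisfies the expectation.
count_matches() {
	case "$1" in
	+*) [ "$2" -ge "${1#+}" ] ;;   # "+N": N or more is accepted
	*)  [ "$2" -eq "$1" ] ;;       # "N": exactly N is required
	esac
}

# Example: "+1" accepts 3 observed failures, while "1" does not.
count_matches +1 3 && echo "+1 accepts 3"
count_matches 1 3 || echo "1 rejects 3"
```

Under this reading, changing "+1 +0" to "1 0" makes the check stricter: the counters must match exactly instead of being lower bounds.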

Comments

MPTCP CI May 4, 2022, 11:51 a.m. UTC | #1
Hi Geliang,

Thank you for your modifications, that's great!

But sadly, our CI spotted some issues with it when trying to build it.

You can find more details there:

  https://patchwork.kernel.org/project/mptcp/patch/cdcd6b9963a1c15b20ffcf71e06b6de70f67ec4a.1651664645.git.geliang.tang@suse.com/
  https://github.com/multipath-tcp/mptcp_net-next/actions/runs/2269548473

Status: failure
Initiator: MPTCPimporter
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/cd2b6a887541

Feel free to reply to this email if you cannot access logs, if you need
some support to fix the error, if this doesn't seem to be caused by your
modifications or if the error is a false positive one.

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)
Geliang Tang May 4, 2022, 12:14 p.m. UTC | #2
On Wed, May 04, 2022 at 07:44:24PM +0800, Geliang Tang wrote:
> Two small cleanups for MP_FAIL reset test case:
> 
> Reduce the test files size from 1024 to 128, to make the test faster.

The original commit log needs to be updated too:

 1024KB -> 128KB

'''
Add the multiple subflows test case for MP_FAIL, to test the MP_FAIL
reset case. Use the test_linkfail value to make 128KB test files.

...
...
'''

> With Paolo's fix [2] for MP_FAIL test-case, the test becomes very slow.
> 
> Drop '+'s passed to chk_csum_nr, with Paolo's fix [1] for act_pedit, no
> extra checksum failures now, no need to add '+'s anymore.
> 
> Depends on Paolo's commits:
> [1] net/sched: act_pedit: really ensure the skb is writable
> [2] selftests: mptcp: fix MP_FAIL test-case
> 
> Signed-off-by: Geliang Tang <geliang.tang@suse.com>
> ---
>  tools/testing/selftests/net/mptcp/mptcp_join.sh | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> index a5e90532eaea..91c840ec91c8 100755
> --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
> +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> @@ -2739,8 +2739,8 @@ fail_tests()
>  		pm_nl_set_limits $ns1 0 1
>  		pm_nl_set_limits $ns2 0 1
>  		pm_nl_add_endpoint $ns2 10.0.2.2 dev ns2eth2 flags subflow
> -		run_tests $ns1 $ns2 10.0.1.1 1024
> -		chk_join_nr 1 1 1 +1 +0 1 1 0 "$(pedit_action_pkts)"
> +		run_tests $ns1 $ns2 10.0.1.1 128
> +		chk_join_nr 1 1 1 1 0 1 1 0 "$(pedit_action_pkts)"
>  	fi
>  }
>  
> -- 
> 2.34.1
>
MPTCP CI May 4, 2022, 12:21 p.m. UTC | #3
Hi Geliang,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal:
  - Script error! ❓:
  - Task: https://cirrus-ci.com/task/5972194063810560
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/5972194063810560/summary/summary.txt

- KVM Validation: debug:
  - Script error! ❓:
  - Task: https://cirrus-ci.com/task/5409244110389248
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/5409244110389248/summary/summary.txt

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/cd2b6a887541


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-debug

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)
Paolo Abeni May 4, 2022, 1:03 p.m. UTC | #4
Hi Geliang,

On Wed, 2022-05-04 at 20:14 +0800, Geliang Tang wrote:
> On Wed, May 04, 2022 at 07:44:24PM +0800, Geliang Tang wrote:
> > Two small cleanups for MP_FAIL reset test case:
> > 
> > Reduce the test files size from 1024 to 128, to make the test faster.
> 
> The original commit log needs to be updated too:
> 
>  1024KB -> 128KB
> 
> '''
> Add the multiple subflows test case for MP_FAIL, to test the MP_FAIL
> reset case. Use the test_linkfail value to make 128KB test files.
> 
> ...
> ...
> '''
> 
> > With Paolo's fix [2] for MP_FAIL test-case, the test becomes very slow.
> > 
> > Drop '+'s passed to chk_csum_nr, with Paolo's fix [1] for act_pedit, no
> > extra checksum failures now, no need to add '+'s anymore.
> > 
> > Depends on Paolo's commits:
> > [1] net/sched: act_pedit: really ensure the skb is writable
> > [2] selftests: mptcp: fix MP_FAIL test-case
> > 
> > Signed-off-by: Geliang Tang <geliang.tang@suse.com>

I noticed the time increase. Unfortunately, it may be necessary for the
test to be reliable in all environments, including the slower ones.

Mat (M.) was able to reproduce some failure with b/w == 10Mbps and size
== 1M. I think/guess he could hit the same failure even with 1Mbps and
size == 128K.
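
As a rough back-of-the-envelope illustration of the two configurations above, the sketch below estimates the raw transfer time for a given file size and link rate. It assumes sizes are in KB of 1024 bytes and rates are in Mbps of 10^6 bits/s, and it ignores protocol overhead, latency and the second subflow. transfer_ms is a hypothetical helper, not part of the selftests.

```shell
# Hypothetical helper: estimated transfer time in milliseconds.
# $1 = file size in KB (assumed 1024 bytes each)
# $2 = link rate in Mbps (assumed 10^6 bits per second each)
transfer_ms() {
	# bits to send * 1000 ms, divided by bits per second (integer division)
	echo $(( $1 * 1024 * 8 * 1000 / ($2 * 1000000) ))
}

transfer_ms 1024 10   # 1M file on a 10Mbps link
transfer_ms 128 1     # 128K file on a 1Mbps link
transfer_ms 128 10    # 128K file on a 10Mbps link
```

Under these assumptions, 1024 KB at 10 Mbps and 128 KB at 1 Mbps both take on the order of one second, while 128 KB at 10 Mbps takes roughly a tenth of that.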

We need at least some additional testing for this one.

Thanks!

Paolo
Matthieu Baerts May 4, 2022, 1:08 p.m. UTC | #5
Hi Paolo, Geliang,

On 04/05/2022 15:03, Paolo Abeni wrote:
> Hi Geliang,
> 
> On Wed, 2022-05-04 at 20:14 +0800, Geliang Tang wrote:
>> On Wed, May 04, 2022 at 07:44:24PM +0800, Geliang Tang wrote:
>>> Two small cleanups for MP_FAIL reset test case:
>>>
>>> Reduce the test files size from 1024 to 128, to make the test faster.
>>
>> The original commit log needs to be updated too:
>>
>>  1024KB -> 128KB
>>
>> '''
>> Add the multiple subflows test case for MP_FAIL, to test the MP_FAIL
>> reset case. Use the test_linkfail value to make 128KB test files.
>>
>> ...
>> ...
>> '''
>>
>>> With Paolo's fix [2] for MP_FAIL test-case, the test becomes very slow.
>>>
>>> Drop '+'s passed to chk_csum_nr, with Paolo's fix [1] for act_pedit, no
>>> extra checksum failures now, no need to add '+'s anymore.
>>>
>>> Depends on Paolo's commits:
>>> [1] net/sched: act_pedit: really ensure the skb is writable
>>> [2] selftests: mptcp: fix MP_FAIL test-case
>>>
>>> Signed-off-by: Geliang Tang <geliang.tang@suse.com>
> 
> I noticed the time increase. Unfortunately, it may be necessary for the
> test to be reliable in all environments, including the slower ones.

I might not have followed what the root cause is, but would it help to add
some latency on the primary link to force the kernel to use the second one?
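
A minimal sketch of what adding latency to one link could look like with tc netem. The helper name netem_delay_cmd, the device name ns1eth1 and the 10 ms value are assumptions for illustration only; nothing in this thread confirms which link or delay would be appropriate. The helper only prints the command so it can be inspected before being run as root.

```shell
# Hypothetical helper: print (do not run) a tc command that sets a netem
# delay on a device inside a network namespace.
# $1 = network namespace, $2 = device, $3 = delay in milliseconds
netem_delay_cmd() {
	# "replace" is used instead of "add" in case a root qdisc already exists;
	# note that this would overwrite any existing root qdisc settings.
	echo tc -n "$1" qdisc replace dev "$2" root netem delay "${3}ms"
}

# Example: show the command for a 10 ms delay on an assumed primary link.
netem_delay_cmd ns1 ns1eth1 10
```

To apply the setting, the printed command can be run directly, or the leading echo can be dropped from the helper.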

Cheers,
Matt
Geliang Tang May 4, 2022, 1:21 p.m. UTC | #6
Paolo Abeni <pabeni@redhat.com> wrote on Wed, May 4, 2022 at 21:04:
>
> Hi Geliang,
>
> On Wed, 2022-05-04 at 20:14 +0800, Geliang Tang wrote:
> > On Wed, May 04, 2022 at 07:44:24PM +0800, Geliang Tang wrote:
> > > Two small cleanups for MP_FAIL reset test case:
> > >
> > > Reduce the test files size from 1024 to 128, to make the test faster.
> >
> > The original commit log needs to be updated too:
> >
> >  1024KB -> 128KB
> >
> > '''
> > Add the multiple subflows test case for MP_FAIL, to test the MP_FAIL
> > reset case. Use the test_linkfail value to make 128KB test files.
> >
> > ...
> > ...
> > '''
> >
> > > With Paolo's fix [2] for MP_FAIL test-case, the test becomes very slow.
> > >
> > > Drop '+'s passed to chk_csum_nr, with Paolo's fix [1] for act_pedit, no
> > > extra checksum failures now, no need to add '+'s anymore.
> > >
> > > Depends on Paolo's commits:
> > > [1] net/sched: act_pedit: really ensure the skb is writable
> > > [2] selftests: mptcp: fix MP_FAIL test-case
> > >
> > > Signed-off-by: Geliang Tang <geliang.tang@suse.com>
>
> I noticed the time increase. Unfortunately, it may be necessary for the
> test to be reliable in all environments, including the slower ones.
>
> Mat (M.) was able to reproduce some failure with b/w == 10Mbps and size
> == 1M. I think/guess he could hit the same failure even with 1Mbps and
> size == 128K.
>
> We need at least some additional testing for this one.

If so, let's continue to use 1024k test file sizes.

>
> Thanks!
>
> Paolo
>
>
Mat Martineau May 4, 2022, 11 p.m. UTC | #7
On Wed, 4 May 2022, Geliang Tang wrote:

> Paolo Abeni <pabeni@redhat.com> wrote on Wed, May 4, 2022 at 21:04:
>>
>> Hi Geliang,
>>
>> On Wed, 2022-05-04 at 20:14 +0800, Geliang Tang wrote:
>>> On Wed, May 04, 2022 at 07:44:24PM +0800, Geliang Tang wrote:
>>>> Two small cleanups for MP_FAIL reset test case:
>>>>
>>>> Reduce the test files size from 1024 to 128, to make the test faster.
>>>
>>> The original commit log needs to be updated too:
>>>
>>>  1024KB -> 128KB
>>>
>>> '''
>>> Add the multiple subflows test case for MP_FAIL, to test the MP_FAIL
>>> reset case. Use the test_linkfail value to make 128KB test files.
>>>
>>> ...
>>> ...
>>> '''
>>>
>>>> With Paolo's fix [2] for MP_FAIL test-case, the test becomes very slow.
>>>>
>>>> Drop '+'s passed to chk_csum_nr, with Paolo's fix [1] for act_pedit, no
>>>> extra checksum failures now, no need to add '+'s anymore.
>>>>
>>>> Depends on Paolo's commits:
>>>> [1] net/sched: act_pedit: really ensure the skb is writable
>>>> [2] selftests: mptcp: fix MP_FAIL test-case
>>>>
>>>> Signed-off-by: Geliang Tang <geliang.tang@suse.com>
>>
>> I noticed the time increase. Unfortunately, it may be necessary for the
>> test to be reliable in all environments, including the slower ones.
>>
>> Mat (M.) was able to reproduce some failure with b/w == 10Mbps and size
>> == 1M. I think/guess he could hit the same failure even with 1Mbps and
>> size == 128K.
>>
>> We need at least some additional testing for this one.
>
> If so, let's continue to use 1024k test file sizes.
>

It's reliable for me at 128k so far. I'm fine with squashing the patch 
as-is, we can increase the file size if CI (or humans) complain.

--
Mat Martineau
Intel
Mat Martineau May 4, 2022, 11:39 p.m. UTC | #8
On Wed, 4 May 2022, Mat Martineau wrote:

> On Wed, 4 May 2022, Geliang Tang wrote:
>
>> Paolo Abeni <pabeni@redhat.com> wrote on Wed, May 4, 2022 at 21:04:
>>> 
>>> Hi Geliang,
>>> 
>>> On Wed, 2022-05-04 at 20:14 +0800, Geliang Tang wrote:
>>>> On Wed, May 04, 2022 at 07:44:24PM +0800, Geliang Tang wrote:
>>>>> Two small cleanups for MP_FAIL reset test case:
>>>>> 
>>>>> Reduce the test files size from 1024 to 128, to make the test faster.
>>>> 
>>>> The original commit log needs to be updated too:
>>>>
>>>>  1024KB -> 128KB
>>>> 
>>>> '''
>>>> Add the multiple subflows test case for MP_FAIL, to test the MP_FAIL
>>>> reset case. Use the test_linkfail value to make 128KB test files.
>>>> 
>>>> ...
>>>> ...
>>>> '''
>>>> 
>>>>> With Paolo's fix [2] for MP_FAIL test-case, the test becomes very slow.
>>>>> 
>>>>> Drop '+'s passed to chk_csum_nr, with Paolo's fix [1] for act_pedit, no
>>>>> extra checksum failures now, no need to add '+'s anymore.
>>>>> 
>>>>> Depends on Paolo's commits:
>>>>> [1] net/sched: act_pedit: really ensure the skb is writable
>>>>> [2] selftests: mptcp: fix MP_FAIL test-case
>>>>> 
>>>>> Signed-off-by: Geliang Tang <geliang.tang@suse.com>
>>> 
>>> I noticed the time increase. Unfortunately, it may be necessary for the
>>> test to be reliable in all environments, including the slower ones.
>>> 
>>> Mat (M.) was able to reproduce some failure with b/w == 10Mbps and size
>>> == 1M. I think/guess he could hit the same failure even with 1Mbps and
>>> size == 128K.
>>> 
>>> We need at least some additional testing for this one.
>> 
>> If so, let's continue to use 1024k test file sizes.
>> 
>
> It's reliable for me at 128k so far. I'm fine with squashing the patch as-is, 
> we can increase the file size if CI (or humans) complain.
>

It did fail after 193 iterations (kernel with debug config):

002 MP_FAIL MP_RST: 1 corrupted pkts     syn[ ok ] - synack[ ok ] - ack[ ok ]
                                          sum[ ok ] - csum  [ ok ]
                                          ftx[ ok ] - failrx[fail] got 0 MP_FAIL[s] RX expected 1

So 128k seems to be too small.
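
An intermittent failure like this one only shows up when the test is repeated many times. A minimal sketch of such a loop follows, assuming the test command exits non-zero on failure. run_until_fail is a hypothetical helper, and the exact command used for the 193 iterations above is not shown in this thread.

```shell
# Hypothetical helper: run a command repeatedly until it fails or until
# the iteration limit is reached.
# $1 = maximum number of iterations, remaining arguments = command to run
run_until_fail() {
	max="$1"
	shift
	i=1
	while [ "$i" -le "$max" ]; do
		if ! "$@"; then
			echo "failed at iteration $i"
			return 1
		fi
		i=$((i + 1))
	done
	echo "passed $max iterations"
}

# Example with a command that always succeeds.
run_until_fail 3 true
```

In practice the last arguments would be the selftest invocation, for example ./mptcp_join.sh run from tools/testing/selftests/net/mptcp, with a limit large enough to catch rare failures.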

--
Mat Martineau
Intel
Patch

diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index a5e90532eaea..91c840ec91c8 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -2739,8 +2739,8 @@  fail_tests()
 		pm_nl_set_limits $ns1 0 1
 		pm_nl_set_limits $ns2 0 1
 		pm_nl_add_endpoint $ns2 10.0.2.2 dev ns2eth2 flags subflow
-		run_tests $ns1 $ns2 10.0.1.1 1024
-		chk_join_nr 1 1 1 +1 +0 1 1 0 "$(pedit_action_pkts)"
+		run_tests $ns1 $ns2 10.0.1.1 128
+		chk_join_nr 1 1 1 1 0 1 1 0 "$(pedit_action_pkts)"
 	fi
 }