[net,v3,3/3] selftests: fib_tests: Add multipath list receive tests

Message ID: 20230828113221.20123-4-sriram.yagnaraman@est.tech
State: New
Series: Avoid TCP resets when using ECMP for load-balancing between multiple servers.

Commit Message

Sriram Yagnaraman Aug. 28, 2023, 11:32 a.m. UTC
The test uses perf stat to count the number of fib:fib_table_lookup
tracepoint hits for IPv4 and the number of fib6:fib6_table_lookup for
IPv6. The measured count is checked to be within 5% of the total number
of packets sent via veth1.

Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
---
 tools/testing/selftests/net/fib_tests.sh | 150 ++++++++++++++++++++++-
 1 file changed, 149 insertions(+), 1 deletion(-)
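
As a rough illustration of the check described in the commit message (this is not code from the patch): the sketch below compares a tracepoint hit count against the number of packets sent and passes when the count reaches at least 95% of it, which is the threshold the patch applies to the multipath case. The helper name hit_ratio_ok and its min_pct parameter are made up for this example, and it uses integer arithmetic where the patch uses bc.

```shell
# Hypothetical helper, not part of the patch: succeeds when the number of
# tracepoint hits is at least min_pct percent of the packets sent.
hit_ratio_ok()
{
	local count=$1          # tracepoint hits reported by perf stat
	local expected=$2       # packets sent (65536 in the patch)
	local min_pct=${3:-95}  # assumed default, mirrors the ">= 0.95" check

	# Integer arithmetic avoids a dependency on bc.
	[ $((count * 100)) -ge $((expected * min_pct)) ]
}

# A lookup for nearly every packet passes; roughly half does not.
hit_ratio_ok 65000 65536 && echo "pass"
hit_ratio_ok 30000 65536 || echo "fail"
```

The single path case in the patch inverts the comparison ("< 0.95"), since a cached lookup result may be reused there.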

Comments

Ido Schimmel Aug. 28, 2023, 3:24 p.m. UTC | #1
On Mon, Aug 28, 2023 at 01:32:21PM +0200, Sriram Yagnaraman wrote:
> The test uses perf stat to count the number of fib:fib_table_lookup
> tracepoint hits for IPv4 and the number of fib6:fib6_table_lookup for
> IPv6. The measured count is checked to be within 5% of the total number
> of packets sent via veth1.
> 
> Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>

I just tested this with a debug config and noticed that the single path
test is not very stable. It's not really related to the bug fix, so I
think you can simply remove it.

Jakub / Paolo, this change conflicts with changes in net-next and I
assume that the next PR that you are going to send is from net-next.
What is your preference in this case? Wait for the PR to be accepted and
for master to be merged into net?

Thanks

David Ahern Aug. 28, 2023, 6:57 p.m. UTC | #2
On 8/28/23 9:24 AM, Ido Schimmel wrote:
> On Mon, Aug 28, 2023 at 01:32:21PM +0200, Sriram Yagnaraman wrote:
>> The test uses perf stat to count the number of fib:fib_table_lookup
>> tracepoint hits for IPv4 and the number of fib6:fib6_table_lookup for
>> IPv6. The measured count is checked to be within 5% of the total number
>> of packets sent via veth1.
>>
>> Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
> 
> I just tested this with a debug config and noticed that the single path
> test is not very stable. It's not really related to the bug fix, so I
> think you can simply remove it.
> 
> Jakub / Paolo, this change conflicts with changes in net-next and I
> assume that the next PR that you are going to send is from net-next.
> What is your preference in this case? Wait for the PR to be accepted and
> for master to be merged into net?
> 
> Thanks

Ido, thanks for staying on top of this change and the details with the
test cases.

Jakub Kicinski Aug. 28, 2023, 7:14 p.m. UTC | #3
On Mon, 28 Aug 2023 18:24:20 +0300 Ido Schimmel wrote:
> Jakub / Paolo, this change conflicts with changes in net-next and I
> assume that the next PR that you are going to send is from net-next.
> What is your preference in this case? Wait for the PR to be accepted and
> for master to be merged into net?

The trees will be merged before the PR, in the next 24h.
As soon as they are you can resend for net-next.

Sriram Yagnaraman Aug. 30, 2023, 9:17 a.m. UTC | #4
> -----Original Message-----
> From: Ido Schimmel <idosch@idosch.org>
> Sent: Monday, 28 August 2023 17:24
> To: Sriram Yagnaraman <sriram.yagnaraman@est.tech>; kuba@kernel.org;
> pabeni@redhat.com
> Cc: netdev@vger.kernel.org; linux-kselftest@vger.kernel.org; David S . Miller
> <davem@davemloft.net>; Eric Dumazet <edumazet@google.com>; Jakub
> Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; David Ahern
> <dsahern@kernel.org>; Ido Schimmel <idosch@nvidia.com>; Shuah Khan
> <shuah@kernel.org>; Petr Machata <petrm@nvidia.com>
> Subject: Re: [PATCH net v3 3/3] selftests: fib_tests: Add multipath list receive
> tests
> 
> On Mon, Aug 28, 2023 at 01:32:21PM +0200, Sriram Yagnaraman wrote:
> > The test uses perf stat to count the number of fib:fib_table_lookup
> > tracepoint hits for IPv4 and the number of fib6:fib6_table_lookup for
> > IPv6. The measured count is checked to be within 5% of the total
> > number of packets sent via veth1.
> >
> > Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
> 
> I just tested this with a debug config and noticed that the single path test is not
> very stable. It's not really related to the bug fix, so I think you can simply
> remove it.
> 

Sent v4 with just the multipath test and rebased to latest after the merge with net-next. 

If it is OK with all of you here, should I try to improve this test to
verify TCP resets don't happen when the nexthop is in a multipath group,
perhaps using iperf3? I can send another patch if/when I get something
working.

Ido Schimmel Aug. 30, 2023, 3:42 p.m. UTC | #5
On Wed, Aug 30, 2023 at 09:17:22AM +0000, Sriram Yagnaraman wrote:
> If it is OK with all of you here, should I try to improve this test to verify TCP resets don't happen when the nexthop is in a multipath group, perhaps using iperf3? I can send another patch if/when I get something working.

Yes, just make sure it's stable. That is, the test reliably fails
without the fixes and reliably passes with the fixes.
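
One way to check the stability asked for here is to run the test repeatedly on a kernel with the fixes and on one without, and count the failures in each case. A minimal sketch, assuming a hypothetical helper count_failures and assuming the test script's exit status reflects whether the test failed:

```shell
# Hypothetical helper, not part of the patch: run a command a given number
# of times and print how many of the runs exited with a non-zero status.
count_failures()
{
	local runs=$1; shift
	local failures=0
	local i=0

	while [ "$i" -lt "$runs" ]; do
		"$@" > /dev/null 2>&1 || failures=$((failures + 1))
		i=$((i + 1))
	done

	echo "$failures"
}

# Stand-in commands to show the behaviour of the helper itself.
count_failures 3 true
count_failures 3 false
```

With the test from this patch, an invocation along the lines of `count_failures 20 ./fib_tests.sh -t ipv4_mpath_list` would be expected to print 0 on a fixed kernel and 20 on an unfixed one if the test is stable. The run count of 20 is arbitrary.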

Patch

diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh
index 35d89dfa6f11..1cf78cf4d346 100755
--- a/tools/testing/selftests/net/fib_tests.sh
+++ b/tools/testing/selftests/net/fib_tests.sh
@@ -9,7 +9,7 @@  ret=0
 ksft_skip=4
 
 # all tests in this script. Can be overridden with -t option
-TESTS="unregister down carrier nexthop suppress ipv6_notify ipv4_notify ipv6_rt ipv4_rt ipv6_addr_metric ipv4_addr_metric ipv6_route_metrics ipv4_route_metrics ipv4_route_v6_gw rp_filter ipv4_del_addr ipv4_mangle ipv6_mangle ipv4_bcast_neigh"
+TESTS="unregister down carrier nexthop suppress ipv6_notify ipv4_notify ipv6_rt ipv4_rt ipv6_addr_metric ipv4_addr_metric ipv6_route_metrics ipv4_route_metrics ipv4_route_v6_gw rp_filter ipv4_del_addr ipv4_mangle ipv6_mangle ipv4_bcast_neigh ipv4_mpath_list ipv6_mpath_list"
 
 VERBOSE=0
 PAUSE_ON_FAIL=no
@@ -2138,6 +2138,152 @@  ipv4_bcast_neigh_test()
 	cleanup
 }
 
+mpath_dep_check()
+{
+	if [ ! -x "$(command -v mausezahn)" ]; then
+		echo "mausezahn command not found. Skipping test"
+		return 1
+	fi
+
+	if [ ! -x "$(command -v jq)" ]; then
+		echo "jq command not found. Skipping test"
+		return 1
+	fi
+
+	if [ ! -x "$(command -v bc)" ]; then
+		echo "bc command not found. Skipping test"
+		return 1
+	fi
+
+	if [ ! -x "$(command -v perf)" ]; then
+		echo "perf command not found. Skipping test"
+		return 1
+	fi
+
+	perf list 'fib:*' | grep -q fib_table_lookup
+	if [ $? -ne 0 ]; then
+		echo "IPv4 FIB tracepoint not found. Skipping test"
+		return 1
+	fi
+
+	perf list 'fib6:*' | grep -q fib6_table_lookup
+	if [ $? -ne 0 ]; then
+		echo "IPv6 FIB tracepoint not found. Skipping test"
+		return 1
+	fi
+
+	return 0
+}
+
+list_rcv_eval()
+{
+	local name=$1; shift
+	local file=$1; shift
+	local expected=$1; shift
+	local exp=$1; shift
+
+
+	local count=$(tail -n 1 $file | jq '.["counter-value"] | tonumber | floor')
+	local ratio=$(echo "scale=2; $count / $expected" | bc -l)
+	local res=$(echo "$ratio $exp" | bc)
+	[[ $res -eq 1 ]]
+	log_test $? 0 "$name route hit ratio ($ratio)"
+}
+
+ipv4_mpath_list_test()
+{
+	echo
+	echo "IPv4 multipath list receive tests"
+
+	mpath_dep_check || return 1
+
+	route_setup
+
+	set -e
+	run_cmd "ip netns exec ns1 ethtool -K veth1 tcp-segmentation-offload off"
+
+	run_cmd "ip netns exec ns2 bash -c \"echo 20000 > /sys/class/net/veth2/gro_flush_timeout\""
+	run_cmd "ip netns exec ns2 bash -c \"echo 1 > /sys/class/net/veth2/napi_defer_hard_irqs\""
+	run_cmd "ip netns exec ns2 ethtool -K veth2 generic-receive-offload on"
+	run_cmd "ip -n ns2 link add name nh1 up type dummy"
+	run_cmd "ip -n ns2 link add name nh2 up type dummy"
+	run_cmd "ip -n ns2 address add 172.16.201.1/24 dev nh1"
+	run_cmd "ip -n ns2 address add 172.16.202.1/24 dev nh2"
+	run_cmd "ip -n ns2 neigh add 172.16.201.2 lladdr 00:11:22:33:44:55 nud perm dev nh1"
+	run_cmd "ip -n ns2 neigh add 172.16.202.2 lladdr 00:aa:bb:cc:dd:ee nud perm dev nh2"
+	run_cmd "ip -n ns2 route add 203.0.113.0/24
+		nexthop via 172.16.201.2 nexthop via 172.16.202.2"
+	run_cmd "ip netns exec ns2 sysctl -qw net.ipv4.fib_multipath_hash_policy=1"
+	set +e
+
+	local dmac=$(ip -n ns2 -j link show dev veth2 | jq -r '.[]["address"]')
+	local tmp_file=$(mktemp)
+	local cmd="ip netns exec ns1 mausezahn veth1 -a own -b $dmac
+		-A 172.16.101.1 -B 203.0.113.1 -t udp 'sp=12345,dp=0-65535' -q"
+
+	# Packets forwarded in a list using a multipath route must not reuse a
+	# cached result so that a flow always hits the same nexthop. In other
+	# words, the FIB lookup tracepoint needs to be triggered for every
+	# packet.
+	run_cmd "perf stat -e fib:fib_table_lookup --filter 'err == 0' -j -o $tmp_file -- $cmd"
+	list_rcv_eval "Multipath" $tmp_file 65536 ">= 0.95"
+
+	# The same is not true for a single path route.
+	run_cmd "ip -n ns2 route replace 203.0.113.0/24 nexthop via 172.16.201.2"
+	run_cmd "perf stat -e fib:fib_table_lookup --filter 'err == 0' -j -o $tmp_file -- $cmd"
+	list_rcv_eval "Single path" $tmp_file 65536 "< 0.95"
+
+	rm $tmp_file
+	route_cleanup
+}
+
+ipv6_mpath_list_test()
+{
+	echo
+	echo "IPv6 multipath list receive tests"
+
+	mpath_dep_check || return 1
+
+	route_setup
+
+	set -e
+	run_cmd "ip netns exec ns1 ethtool -K veth1 tcp-segmentation-offload off"
+
+	run_cmd "ip netns exec ns2 bash -c \"echo 20000 > /sys/class/net/veth2/gro_flush_timeout\""
+	run_cmd "ip netns exec ns2 bash -c \"echo 1 > /sys/class/net/veth2/napi_defer_hard_irqs\""
+	run_cmd "ip netns exec ns2 ethtool -K veth2 generic-receive-offload on"
+	run_cmd "ip -n ns2 link add name nh1 up type dummy"
+	run_cmd "ip -n ns2 link add name nh2 up type dummy"
+	run_cmd "ip -n ns2 -6 address add 2001:db8:201::1/64 dev nh1"
+	run_cmd "ip -n ns2 -6 address add 2001:db8:202::1/64 dev nh2"
+	run_cmd "ip -n ns2 -6 neigh add 2001:db8:201::2 lladdr 00:11:22:33:44:55 nud perm dev nh1"
+	run_cmd "ip -n ns2 -6 neigh add 2001:db8:202::2 lladdr 00:aa:bb:cc:dd:ee nud perm dev nh2"
+	run_cmd "ip -n ns2 -6 route add 2001:db8:301::/64
+		nexthop via 2001:db8:201::2 nexthop via 2001:db8:202::2"
+	run_cmd "ip netns exec ns2 sysctl -qw net.ipv6.fib_multipath_hash_policy=1"
+	set +e
+
+	local dmac=$(ip -n ns2 -j link show dev veth2 | jq -r '.[]["address"]')
+	local tmp_file=$(mktemp)
+	local cmd="ip netns exec ns1 mausezahn -6 veth1 -a own -b $dmac
+		-A 2001:db8:101::1 -B 2001:db8:301::1 -t udp 'sp=12345,dp=0-65535' -q"
+
+	# Packets forwarded in a list using a multipath route must not reuse a
+	# cached result so that a flow always hits the same nexthop. In other
+	# words, the FIB lookup tracepoint needs to be triggered for every
+	# packet.
+	run_cmd "perf stat -e fib6:fib6_table_lookup --filter 'err == 0' -j -o $tmp_file -- $cmd"
+	list_rcv_eval "Multipath" $tmp_file 65536 ">= 0.95"
+
+	# The same is not true for a single path route.
+	run_cmd "ip -n ns2 route replace 2001:db8:301::/64 nexthop via 2001:db8:201::2"
+	run_cmd "perf stat -e fib6:fib6_table_lookup --filter 'err == 0' -j -o $tmp_file -- $cmd"
+	list_rcv_eval "Single path" $tmp_file 65536 "< 0.95"
+
+	rm $tmp_file
+	route_cleanup
+}
+
 ################################################################################
 # usage
 
@@ -2217,6 +2363,8 @@  do
 	ipv4_mangle)			ipv4_mangle_test;;
 	ipv6_mangle)			ipv6_mangle_test;;
 	ipv4_bcast_neigh)		ipv4_bcast_neigh_test;;
+	ipv4_mpath_list)		ipv4_mpath_list_test;;
+	ipv6_mpath_list)		ipv6_mpath_list_test;;
 
 	help) echo "Test names: $TESTS"; exit 0;;
 	esac