
tcp/dccp: replace using only even ports with all ports

Message ID 20240722094119.31128-1-xiaolinkui@126.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers

Checks

Context                         Check    Description
netdev/series_format            warning  Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection           success  Guessed tree name to be net-next
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 273 this patch: 273
netdev/build_tools              success  No tools touched, skip
netdev/cc_maintainers           success  CCed 5 of 5 maintainers
netdev/build_clang              success  Errors and warnings before: 281 this patch: 281
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 281 this patch: 281
netdev/checkpatch               warning  WARNING: line length of 82 exceeds 80 columns
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0
netdev/contest                  success  net-next-2024-07-22--15-00 (tests: 695)

Commit Message

xiaolinkui July 22, 2024, 9:41 a.m. UTC
From: Linkui Xiao <xiaolinkui@kylinos.com>

Commit 207184853dbd ("tcp/dccp: change source port selection at connect()
time") was meant to address the increased cost of connect() when all even
ports are in use.

In my testing environment, however, this extra cost has not gone away.

The testing environment is as follows:
1. Set up an HTTP server (http://192.168.55.1:9999/).
2. On the client side, run the ab command to open a large number of
connections, then kill it so that many TIME-WAIT connections are left behind:

TARGET_TIME_WAIT=16384
CONCURRENCY=20000
MAX_CONCURRENCY=20000
MIN_CONCURRENCY=5000

while true; do
  CURRENT_TIME_WAIT=$(ss -tanp | grep -c TIME-WAIT)
  echo "Current TIME_WAIT connections: $CURRENT_TIME_WAIT"

  if [ "$CURRENT_TIME_WAIT" -lt "$TARGET_TIME_WAIT" ]; then
    if [ "$CONCURRENCY" -lt "$MAX_CONCURRENCY" ]; then
      CONCURRENCY=$((CONCURRENCY + 5000))
      if [ "$CONCURRENCY" -gt "$MAX_CONCURRENCY" ]; then
        CONCURRENCY=$MAX_CONCURRENCY
      fi
      echo "Increasing concurrency to: $CONCURRENCY"
    fi
  elif [ "$CURRENT_TIME_WAIT" -gt "$TARGET_TIME_WAIT" ]; then
    if [ "$CONCURRENCY" -gt "$MIN_CONCURRENCY" ]; then
      CONCURRENCY=$((CONCURRENCY - 5000))
      if [ "$CONCURRENCY" -lt "$MIN_CONCURRENCY" ]; then
        CONCURRENCY=$MIN_CONCURRENCY
      fi
      echo "Decreasing concurrency to: $CONCURRENCY"
    fi
  fi

  ab -r -n 100000 -c "$CONCURRENCY" http://192.168.55.1:9999/ &

  AB_PID=$!
  sleep 1
  kill "$AB_PID"
  sleep 1
done

On the client side, use the command "mpstat -P ALL 1" to monitor the load.
The %sys load decreased by about 50% after patching.

Signed-off-by: Linkui Xiao <xiaolinkui@kylinos.com>
---
 net/ipv4/inet_hashtables.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

Comments

Eric Dumazet July 22, 2024, 2:50 p.m. UTC | #1
On Mon, Jul 22, 2024 at 2:41 AM <xiaolinkui@126.com> wrote:
>
> From: Linkui Xiao <xiaolinkui@kylinos.com>
>
> In commit 207184853dbd ("tcp/dccp: change source port selection at connect()
> time"), the purpose is to address the issue of increased costs when all even
> ports are in use.
>
> But in my testing environment, this more cost issue has not been resolved.

You missed the whole point of 1580ab63fc9a ("tcp/dccp: better use of
ephemeral ports in connect()")

Have you read the 207184853dbd ("tcp/dccp: change source port selection at
connect() ...") changelog, and are you using IP_LOCAL_PORT_RANGE?

Eric Dumazet July 25, 2024, 7:32 a.m. UTC | #2
On Thu, Jul 25, 2024 at 9:07 AM xiaolinkui <xiaolinkui@126.com> wrote:
>
> Thank you for your reply.
>
> At 2024-07-22 21:50:39, "Eric Dumazet" <edumazet@google.com> wrote:
> >On Mon, Jul 22, 2024 at 2:41 AM <xiaolinkui@126.com> wrote:
> >>
> >> From: Linkui Xiao <xiaolinkui@kylinos.com>
> >>
> >> In commit 207184853dbd ("tcp/dccp: change source port selection at connect()
> >> time"), the purpose is to address the issue of increased costs when all even
> >> ports are in use.
> >>
> >> But in my testing environment, this more cost issue has not been resolved.
> >
> >You missed the whole point of 1580ab63fc9a ("tcp/dccp: better use of
> >ephemeral ports in connect()")
> >
> >Have you read 207184853dbd ("tcp/dccp: change source port selection at
> >connect() ..." changelog and are you using IP_LOCAL_PORT_RANGE ?
>
> There seems to be some difference between IP_LOCAL_PORT_RANGE
> and "sysctl net.ipv4.ip_local_port_range". An application can set
> IP_LOCAL_PORT_RANGE from user space with the following system call:
> setsockopt(sockfd, IPPROTO_IP, IP_LOCAL_PORT_RANGE, &opt, sizeof(opt));
>
> But user behavior is uncontrollable. Is there any other way to use
> IP_LOCAL_PORT_RANGE?

If user behavior cannot be changed, this is on their end.

Sorry, we won't accept a patch going back to the terrible situation we had
before, where applications would fail completely in many cases.
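
For readers following the IP_LOCAL_PORT_RANGE suggestion: in the unpatched code, step is 1 whenever inet_sk_get_local_port_range() reports a per-socket range, so a socket that sets this option already scans every port. The following is a minimal userspace sketch of setting it, not something posted in the thread. The helper names pack_port_range() and set_local_port_range() are hypothetical, and the fallback value 51 is an assumption for older userspace headers, taken from the Linux UAPI <linux/in.h>. The option value is a 32-bit integer with the upper bound in the high 16 bits and the lower bound in the low 16 bits.

```c
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <sys/socket.h>

/* Fallback for older userspace headers (assumed value from <linux/in.h>). */
#ifndef IP_LOCAL_PORT_RANGE
#define IP_LOCAL_PORT_RANGE 51
#endif

/* Pack a range: upper bound in the high 16 bits, lower bound in the low 16. */
static uint32_t pack_port_range(uint16_t lo, uint16_t hi)
{
	return ((uint32_t)hi << 16) | lo;
}

/*
 * Hypothetical helper: restrict the ephemeral port range of one socket.
 * Returns 0 on success or a negative errno value. Kernels without the
 * option are expected to fail the setsockopt() call.
 */
static int set_local_port_range(int fd, uint16_t lo, uint16_t hi)
{
	uint32_t opt;

	if (lo > hi)
		return -EINVAL;

	opt = pack_port_range(lo, hi);
	if (setsockopt(fd, IPPROTO_IP, IP_LOCAL_PORT_RANGE, &opt, sizeof(opt)) < 0)
		return -errno;

	return 0;
}
```

A caller would invoke set_local_port_range(fd, 40000, 50000) after socket() and before connect(). The range 40000-50000 is only an illustration.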

Patch

diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 48d0d494185b..4192531ba2d3 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -1007,7 +1007,7 @@  int __inet_hash_connect(struct inet_timewait_death_row *death_row,
 	u32 remaining, offset;
 	int ret, i, low, high;
 	bool local_ports;
-	int step, l3mdev;
+	int l3mdev;
 	u32 index;
 
 	if (port) {
@@ -1020,7 +1020,6 @@  int __inet_hash_connect(struct inet_timewait_death_row *death_row,
 	l3mdev = inet_sk_bound_l3mdev(sk);
 
 	local_ports = inet_sk_get_local_port_range(sk, &low, &high);
-	step = local_ports ? 1 : 2;
 
 	high++; /* [32768, 60999] -> [32768, 61000[ */
 	remaining = high - low;
@@ -1041,7 +1040,7 @@  int __inet_hash_connect(struct inet_timewait_death_row *death_row,
 		offset &= ~1U;
 other_parity_scan:
 	port = low + offset;
-	for (i = 0; i < remaining; i += step, port += step) {
+	for (i = 0; i < remaining; i += 1, port += 1) {
 		if (unlikely(port >= high))
 			port -= remaining;
 		if (inet_is_local_reserved_port(net, port))
@@ -1108,8 +1107,8 @@  int __inet_hash_connect(struct inet_timewait_death_row *death_row,
 	 * on low contention the randomness is maximal and on high contention
 	 * it may be inexistent.
 	 */
-	i = max_t(int, i, get_random_u32_below(8) * step);
-	WRITE_ONCE(table_perturb[index], READ_ONCE(table_perturb[index]) + i + step);
+	i = max_t(int, i, get_random_u32_below(8) * 1);
+	WRITE_ONCE(table_perturb[index], READ_ONCE(table_perturb[index]) + i + 1);
 
 	/* Head lock still held and bh's disabled */
 	inet_bind_hash(sk, tb, tb2, port);
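
The effect of the step variable that this patch removes can be illustrated with a simplified userspace model of the scan order. This is a sketch under stated assumptions, not the kernel function. scan_order() is a hypothetical name and its offset parameter stands in for the hashed and perturbed start offset. Masking remaining to an even value and retrying with the other parity are assumptions about code outside the diff context. The model leaves out the bind-hash lookups and the reserved-port checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace model (not kernel code): record candidate ports in the order
 * the unpatched loop would try them. step == 2 models the default path,
 * step == 1 models a socket with its own local port range.
 * Returns the number of candidates written to out[].
 */
static size_t scan_order(int low, int high_inclusive, uint32_t offset,
			 int step, int *out, size_t cap)
{
	int high = high_inclusive + 1;	/* [low, high_inclusive] -> [low, high[ */
	uint32_t remaining, i;
	size_t n = 0;

	assert(step == 1 || step == 2);
	assert(high_inclusive >= low);

	remaining = (uint32_t)(high - low);
	if (step == 2 && remaining > 1)
		remaining &= ~1U;	/* assumed: even count for the parity scan */

	offset %= remaining;
	if (step == 2)
		offset &= ~1U;		/* first pass uses the parity of low */

	do {
		int port = low + (int)offset;

		for (i = 0; i < remaining; i += (uint32_t)step, port += step) {
			if (port >= high)
				port -= (int)remaining;
			if (n < cap)
				out[n++] = port;
		}
		offset++;
		/* With step == 2, a second pass covers the other parity. */
	} while (step == 2 && (offset & 1) && remaining > 1);

	return n;
}
```

In this model, the range [32768, 32773] with start offset 3 and step 2 yields 32770, 32772, 32768 and then 32771, 32773, 32769. Every port with the parity of low is tried before any port of the other parity. With step 1 the same inputs yield 32771, 32772, 32773, 32768, 32769, 32770.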