diff mbox series

[RFC] tcp_bbr2: use correct 64-bit division

Message ID 4740526.31r3eYUQgx@natalenko.name (mailing list archive)
State RFC
Delegated to: Netdev Maintainers
Headers show
Series [RFC] tcp_bbr2: use correct 64-bit division | expand

Checks

Context Check Description
netdev/tree_selection success Guessing tree name failed - patch did not apply

Commit Message

Oleksandr Natalenko May 22, 2022, 10:29 p.m. UTC
Hello Neal.

It was reported to me [1] by Konstantin (in Cc) that BBRv2 code suffers from integer division issue on 32 bit systems.

Konstantin suggested a solution available in the same linked merge request and copy-pasted by me below for your convenience:

```

Could you please evaluate this report and check whether it is correct, and also check whether the suggested patch is acceptable?

Thanks.

[1] https://gitlab.com/post-factum/pf-kernel/-/merge_requests/6

Comments

David Laight May 24, 2022, 8:01 a.m. UTC | #1
From: Oleksandr Natalenko
> Sent: 22 May 2022 23:30
> To: Neal Cardwell <ncardwell@google.com>
> 
> Hello Neal.
> 
> It was reported to me [1] by Konstantin (in Cc) that BBRv2 code suffers from integer division issue on
> 32 bit systems.

Do any of these divisions ever actually have 64bit operands?
Even on x86-64 64bit divide is significantly slower than 32bit divide.

It is quite clear that x * 8 / 1000 is the same as x / (1000 / 8).
So promoting to 64bit cannot be needed.

	David

> 
> Konstantin suggested a solution available in the same linked merge request and copy-pasted by me below
> for your convenience:
> 
> ```
> diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
> index 664c9e119787..fd3f89e3a8a6 100644
> --- a/net/ipv4/tcp_bbr.c
> +++ b/net/ipv4/tcp_bbr.c
> @@ -312,7 +312,7 @@ static u32 bbr_tso_segs_generic(struct sock *sk, unsigned int mss_now,
>  	bytes = sk->sk_pacing_rate >> sk->sk_pacing_shift;
> 
>  	bytes = min_t(u32, bytes, gso_max_size - 1 - MAX_TCP_HEADER);
> -	segs = max_t(u32, bytes / mss_now, bbr_min_tso_segs(sk));
> +	segs = max_t(u32, div_u64(bytes, mss_now), bbr_min_tso_segs(sk));
>  	return segs;
>  }
> 
> diff --git a/net/ipv4/tcp_bbr2.c b/net/ipv4/tcp_bbr2.c
> index fa49e17c47ca..488429f0f3d0 100644
> --- a/net/ipv4/tcp_bbr2.c
> +++ b/net/ipv4/tcp_bbr2.c
> @@ -588,7 +588,7 @@ static void bbr_debug(struct sock *sk, u32 acked,
>  		 bbr_rate_kbps(sk, bbr_max_bw(sk)), /* bw: max bw */
>  		 0ULL,				    /* lb: [obsolete] */
>  		 0ULL,				    /* ib: [obsolete] */
> -		 (u64)sk->sk_pacing_rate * 8 / 1000,
> +		 div_u64((u64)sk->sk_pacing_rate * 8, 1000),
>  		 acked,
>  		 tcp_packets_in_flight(tp),
>  		 rs->is_ack_delayed ? 'd' : '.',
> @@ -698,7 +698,7 @@ static u32 bbr_tso_segs_generic(struct sock *sk, unsigned int mss_now,
>  	}
> 
>  	bytes = min_t(u32, bytes, gso_max_size - 1 - MAX_TCP_HEADER);
> -	segs = max_t(u32, bytes / mss_now, bbr_min_tso_segs(sk));
> +	segs = max_t(u32, div_u64(bytes, mss_now), bbr_min_tso_segs(sk));
>  	return segs;
>  }
> ```
> 
> Could you please evaluate this report and check whether it is correct, and also check whether the
> suggested patch is acceptable?
> 
> Thanks.
> 
> [1] https://gitlab.com/post-factum/pf-kernel/-/merge_requests/6
> 
> --
> Oleksandr Natalenko (post-factum)
> 

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Neal Cardwell May 24, 2022, 8:05 p.m. UTC | #2
On Tue, May 24, 2022 at 4:01 AM David Laight <David.Laight@aculab.com> wrote:
>
> From: Oleksandr Natalenko
> > Sent: 22 May 2022 23:30
> > To: Neal Cardwell <ncardwell@google.com>
> >
> > Hello Neal.
> >
> > It was reported to me [1] by Konstantin (in Cc) that BBRv2 code suffers from integer division issue on
> > 32 bit systems.
>
> Do any of these divisions ever actually have 64bit operands?
> Even on x86-64 64bit divide is significantly slower than 32bit divide.
>
> It is quite clear that x * 8 / 1000 is the same as x / (1000 / 8).
> So promoting to 64bit cannot be needed.
>
>         David

The sk->sk_pacing_rate can definitely be bigger than 32 bits if the
network path can support more than 34 Gbit/sec  (a pacing rate of 2^32
bytes per sec is roughly 34 Gibt/sec). This definitely happens.

So  this one seems reasonable to me (and is only in debug code, so the
performance is probably fine):
-                (u64)sk->sk_pacing_rate * 8 / 1000,
+                div_u64((u64)sk->sk_pacing_rate * 8, 1000),

For the other two I agree we should rework them to avoid the 64-bit
divide, since we don't need it.

There is similar logic in mainline Linux in tcp_tso_autosize(), which
is currently using "unsigned long" for bytes.

Eric, what do you advise?

thanks,
neal
Eric Dumazet May 24, 2022, 8:22 p.m. UTC | #3
On Tue, May 24, 2022 at 1:06 PM Neal Cardwell <ncardwell@google.com> wrote:
>
> On Tue, May 24, 2022 at 4:01 AM David Laight <David.Laight@aculab.com> wrote:
> >
> > From: Oleksandr Natalenko
> > > Sent: 22 May 2022 23:30
> > > To: Neal Cardwell <ncardwell@google.com>
> > >
> > > Hello Neal.
> > >
> > > It was reported to me [1] by Konstantin (in Cc) that BBRv2 code suffers from integer division issue on
> > > 32 bit systems.
> >
> > Do any of these divisions ever actually have 64bit operands?
> > Even on x86-64 64bit divide is significantly slower than 32bit divide.
> >
> > It is quite clear that x * 8 / 1000 is the same as x / (1000 / 8).
> > So promoting to 64bit cannot be needed.
> >
> >         David
>
> The sk->sk_pacing_rate can definitely be bigger than 32 bits if the
> network path can support more than 34 Gbit/sec  (a pacing rate of 2^32
> bytes per sec is roughly 34 Gibt/sec). This definitely happens.
>
> So  this one seems reasonable to me (and is only in debug code, so the
> performance is probably fine):
> -                (u64)sk->sk_pacing_rate * 8 / 1000,
> +                div_u64((u64)sk->sk_pacing_rate * 8, 1000),
>
> For the other two I agree we should rework them to avoid the 64-bit
> divide, since we don't need it.
>
> There is similar logic in mainline Linux in tcp_tso_autosize(), which
> is currently using "unsigned long" for bytes.
>

Not sure I follow.

sk_pacing_rate is also 'unsigned long'

So tcp_tso_autosize() is correct on 32bit and 64bit arches.
There is no forced 64bit operation there.


> Eric, what do you advise?
>
> thanks,
> neal
diff mbox series

Patch

diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
index 664c9e119787..fd3f89e3a8a6 100644
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -312,7 +312,7 @@  static u32 bbr_tso_segs_generic(struct sock *sk, unsigned int mss_now,
 	bytes = sk->sk_pacing_rate >> sk->sk_pacing_shift;
 
 	bytes = min_t(u32, bytes, gso_max_size - 1 - MAX_TCP_HEADER);
-	segs = max_t(u32, bytes / mss_now, bbr_min_tso_segs(sk));
+	segs = max_t(u32, div_u64(bytes, mss_now), bbr_min_tso_segs(sk));
 	return segs;
 }
 
diff --git a/net/ipv4/tcp_bbr2.c b/net/ipv4/tcp_bbr2.c
index fa49e17c47ca..488429f0f3d0 100644
--- a/net/ipv4/tcp_bbr2.c
+++ b/net/ipv4/tcp_bbr2.c
@@ -588,7 +588,7 @@  static void bbr_debug(struct sock *sk, u32 acked,
 		 bbr_rate_kbps(sk, bbr_max_bw(sk)), /* bw: max bw */
 		 0ULL,				    /* lb: [obsolete] */
 		 0ULL,				    /* ib: [obsolete] */
-		 (u64)sk->sk_pacing_rate * 8 / 1000,
+		 div_u64((u64)sk->sk_pacing_rate * 8, 1000),
 		 acked,
 		 tcp_packets_in_flight(tp),
 		 rs->is_ack_delayed ? 'd' : '.',
@@ -698,7 +698,7 @@  static u32 bbr_tso_segs_generic(struct sock *sk, unsigned int mss_now,
 	}
 
 	bytes = min_t(u32, bytes, gso_max_size - 1 - MAX_TCP_HEADER);
-	segs = max_t(u32, bytes / mss_now, bbr_min_tso_segs(sk));
+	segs = max_t(u32, div_u64(bytes, mss_now), bbr_min_tso_segs(sk));
 	return segs;
 }
```