Message ID | Y44xdN3zH4f+BZCD@zwp-5820-Tower (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [RFC] tcp: correct srtt and mdev_us calculation | expand |
On Mon, Dec 5, 2022 at 6:59 PM Weiping Zhang <zhangweiping@didiglobal.com> wrote: > > From the comments we can see that, rtt = 7/8 rtt + 1/8 new, > but there is an mistake, > > m -= (srtt >> 3); > srtt += m; > > explain: > m -= (srtt >> 3); //use t stands for new m > t = m - srtt/8; > > srtt = srtt + t > = srtt + m - srtt/8 > = srtt 7/8 + m > > Test code: > > #include<stdio.h> > > #define u32 unsigned int > > static void test_old(u32 srtt, long mrtt_us) > { > long m = mrtt_us; > u32 old = srtt; > > m -= (srtt >> 3); > srtt += m; > > printf("%s old_srtt: %u mrtt_us: %ld new_srtt: %u\n", __func__, old, mrtt_us, srtt); > } > > static void test_new(u32 srtt, long mrtt_us) > { > long m = mrtt_us; > u32 old = srtt; > > m = ((m - srtt) >> 3); > srtt += m; > > printf("%s old_srtt: %u mrtt_us: %ld new_srtt: %u\n", __func__, old, mrtt_us, srtt); > } > > int main(int argc, char **argv) > { > u32 srtt = 100; > long mrtt_us = 90; > > test_old(srtt, mrtt_us); > test_new(srtt, mrtt_us); > > return 0; > } > > ./a.out > test_old old_srtt: 100 mrtt_us: 90 new_srtt: 178 > test_new old_srtt: 100 mrtt_us: 90 new_srtt: 98 > > Signed-off-by: Weiping Zhang <zhangweiping@didiglobal.com> > --- > net/ipv4/tcp_input.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > index 0640453fce54..0242bb31e1ce 100644 > --- a/net/ipv4/tcp_input.c > +++ b/net/ipv4/tcp_input.c > @@ -848,7 +848,7 @@ static void tcp_rtt_estimator(struct sock *sk, long mrtt_us) > * that VJ failed to avoid. 8) > */ > if (srtt != 0) { > - m -= (srtt >> 3); /* m is now error in rtt est */ > + m = (m - srtt >> 3); /* m is now error in rtt est */ > srtt += m; /* rtt = 7/8 rtt + 1/8 new */ > if (m < 0) { > m = -m; /* m is now abs(error) */ > @@ -864,7 +864,7 @@ static void tcp_rtt_estimator(struct sock *sk, long mrtt_us) > if (m > 0) > m >>= 3; > } else { > - m -= (tp->mdev_us >> 2); /* similar update on mdev */ > + m = (m - tp->mdev_us >> 2); /* similar update on mdev */ > } > tp->mdev_us += m; /* mdev = 3/4 mdev + 1/4 new */ > if (tp->mdev_us > tp->mdev_max_us) { > -- > 2.34.1 > Sorry, this makes no sense to me.
On Mon, Dec 5, 2022 at 1:02 PM Weiping Zhang <zhangweiping@didiglobal.com> wrote: > > From the comments we can see that, rtt = 7/8 rtt + 1/8 new, > but there is an mistake, > > m -= (srtt >> 3); > srtt += m; > > explain: > m -= (srtt >> 3); //use t stands for new m > t = m - srtt/8; > > srtt = srtt + t > = srtt + m - srtt/8 > = srtt 7/8 + m > > Test code: > > #include<stdio.h> > > #define u32 unsigned int > > static void test_old(u32 srtt, long mrtt_us) > { > long m = mrtt_us; > u32 old = srtt; > > m -= (srtt >> 3); > srtt += m; > > printf("%s old_srtt: %u mrtt_us: %ld new_srtt: %u\n", __func__, old, mrtt_us, srtt); > } > > static void test_new(u32 srtt, long mrtt_us) > { > long m = mrtt_us; > u32 old = srtt; > > m = ((m - srtt) >> 3); > srtt += m; > > printf("%s old_srtt: %u mrtt_us: %ld new_srtt: %u\n", __func__, old, mrtt_us, srtt); > } > > int main(int argc, char **argv) > { > u32 srtt = 100; > long mrtt_us = 90; > > test_old(srtt, mrtt_us); > test_new(srtt, mrtt_us); > > return 0; > } > > ./a.out > test_old old_srtt: 100 mrtt_us: 90 new_srtt: 178 > test_new old_srtt: 100 mrtt_us: 90 new_srtt: 98 > > Signed-off-by: Weiping Zhang <zhangweiping@didiglobal.com> Please note that this analysis and this test program do not take account of the fact that srtt in the Linux kernel is maintained in a form where it is shifted left by 3 bits, to maintain a 3-bit fractional part. That is why at first glance it would seem there is a missing multiplication of the new sample by 1/8. By not shifting the new sample when it is added to srtt, the new sample is *implicitly* multiplied by 1/8. > net/ipv4/tcp_input.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > index 0640453fce54..0242bb31e1ce 100644 > --- a/net/ipv4/tcp_input.c > +++ b/net/ipv4/tcp_input.c > @@ -848,7 +848,7 @@ static void tcp_rtt_estimator(struct sock *sk, long mrtt_us) > * that VJ failed to avoid. 8) > */ > if (srtt != 0) { > - m -= (srtt >> 3); /* m is now error in rtt est */ > + m = (m - srtt >> 3); /* m is now error in rtt est */ > srtt += m; /* rtt = 7/8 rtt + 1/8 new */ > if (m < 0) { > m = -m; /* m is now abs(error) */ > @@ -864,7 +864,7 @@ static void tcp_rtt_estimator(struct sock *sk, long mrtt_us) > if (m > 0) > m >>= 3; > } else { > - m -= (tp->mdev_us >> 2); /* similar update on mdev */ > + m = (m - tp->mdev_us >> 2); /* similar update on mdev */ > } > tp->mdev_us += m; /* mdev = 3/4 mdev + 1/4 new */ > if (tp->mdev_us > tp->mdev_max_us) { > -- > 2.34.1 AFAICT this proposed patch does not change the behavior of the code but merely expresses the same operations with slightly different syntax. Am I missing something? :-) thanks, neal
On Mon, Dec 05, 2022 at 02:15 PM -05, Neal Cardwell wrote: > On Mon, Dec 5, 2022 at 1:02 PM Weiping Zhang > <zhangweiping@didiglobal.com> wrote: >> >> From the comments we can see that, rtt = 7/8 rtt + 1/8 new, >> but there is an mistake, >> >> m -= (srtt >> 3); >> srtt += m; >> >> explain: >> m -= (srtt >> 3); //use t stands for new m >> t = m - srtt/8; >> >> srtt = srtt + t >> = srtt + m - srtt/8 >> = srtt 7/8 + m >> >> Test code: >> >> #include<stdio.h> >> >> #define u32 unsigned int >> >> static void test_old(u32 srtt, long mrtt_us) >> { >> long m = mrtt_us; >> u32 old = srtt; >> >> m -= (srtt >> 3); >> srtt += m; >> >> printf("%s old_srtt: %u mrtt_us: %ld new_srtt: %u\n", __func__, old, mrtt_us, srtt); >> } >> >> static void test_new(u32 srtt, long mrtt_us) >> { >> long m = mrtt_us; >> u32 old = srtt; >> >> m = ((m - srtt) >> 3); >> srtt += m; >> >> printf("%s old_srtt: %u mrtt_us: %ld new_srtt: %u\n", __func__, old, mrtt_us, srtt); >> } >> >> int main(int argc, char **argv) >> { >> u32 srtt = 100; >> long mrtt_us = 90; >> >> test_old(srtt, mrtt_us); >> test_new(srtt, mrtt_us); >> >> return 0; >> } >> >> ./a.out >> test_old old_srtt: 100 mrtt_us: 90 new_srtt: 178 >> test_new old_srtt: 100 mrtt_us: 90 new_srtt: 98 >> >> Signed-off-by: Weiping Zhang <zhangweiping@didiglobal.com> > > Please note that this analysis and this test program do not take > account of the fact that srtt in the Linux kernel is maintained in a > form where it is shifted left by 3 bits, to maintain a 3-bit > fractional part. That is why at first glance it would seem there is a > missing multiplication of the new sample by 1/8. By not shifting the > new sample when it is added to srtt, the new sample is *implicitly* > multiplied by 1/8. Nifty. And it's documented. struct tcp_sock { … u32 srtt_us; /* smoothed round trip time << 3 in usecs */ Thanks for the hint. >> net/ipv4/tcp_input.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c >> index 0640453fce54..0242bb31e1ce 100644 >> --- a/net/ipv4/tcp_input.c >> +++ b/net/ipv4/tcp_input.c >> @@ -848,7 +848,7 @@ static void tcp_rtt_estimator(struct sock *sk, long mrtt_us) >> * that VJ failed to avoid. 8) >> */ >> if (srtt != 0) { >> - m -= (srtt >> 3); /* m is now error in rtt est */ >> + m = (m - srtt >> 3); /* m is now error in rtt est */ >> srtt += m; /* rtt = 7/8 rtt + 1/8 new */ >> if (m < 0) { >> m = -m; /* m is now abs(error) */ >> @@ -864,7 +864,7 @@ static void tcp_rtt_estimator(struct sock *sk, long mrtt_us) >> if (m > 0) >> m >>= 3; >> } else { >> - m -= (tp->mdev_us >> 2); /* similar update on mdev */ >> + m = (m - tp->mdev_us >> 2); /* similar update on mdev */ >> } >> tp->mdev_us += m; /* mdev = 3/4 mdev + 1/4 new */ >> if (tp->mdev_us > tp->mdev_max_us) { >> -- >> 2.34.1 > > AFAICT this proposed patch does not change the behavior of the code > but merely expresses the same operations with slightly different > syntax. Am I missing something? :-) I've been wondering about that too. There's a change hiding behind operator precedence. Would be better expressed with explicitly placed parenthesis: m = (m - srtt) >> 3; /* m is now error in rtt est */
Sorry to reply to late caused by bad cold, On Tue, Dec 6, 2022 at 5:29 PM Jakub Sitnicki <jakub@cloudflare.com> wrote: > > On Mon, Dec 05, 2022 at 02:15 PM -05, Neal Cardwell wrote: > > On Mon, Dec 5, 2022 at 1:02 PM Weiping Zhang > > <zhangweiping@didiglobal.com> wrote: > >> > >> From the comments we can see that, rtt = 7/8 rtt + 1/8 new, > >> but there is an mistake, > >> > >> m -= (srtt >> 3); > >> srtt += m; > >> > >> explain: > >> m -= (srtt >> 3); //use t stands for new m > >> t = m - srtt/8; > >> > >> srtt = srtt + t > >> = srtt + m - srtt/8 > >> = srtt 7/8 + m > >> > >> Test code: > >> > >> #include<stdio.h> > >> > >> #define u32 unsigned int > >> > >> static void test_old(u32 srtt, long mrtt_us) > >> { > >> long m = mrtt_us; > >> u32 old = srtt; > >> > >> m -= (srtt >> 3); > >> srtt += m; > >> > >> printf("%s old_srtt: %u mrtt_us: %ld new_srtt: %u\n", __func__, old, mrtt_us, srtt); > >> } > >> > >> static void test_new(u32 srtt, long mrtt_us) > >> { > >> long m = mrtt_us; > >> u32 old = srtt; > >> > >> m = ((m - srtt) >> 3); > >> srtt += m; > >> > >> printf("%s old_srtt: %u mrtt_us: %ld new_srtt: %u\n", __func__, old, mrtt_us, srtt); > >> } > >> > >> int main(int argc, char **argv) > >> { > >> u32 srtt = 100; > >> long mrtt_us = 90; > >> > >> test_old(srtt, mrtt_us); > >> test_new(srtt, mrtt_us); > >> > >> return 0; > >> } > >> > >> ./a.out > >> test_old old_srtt: 100 mrtt_us: 90 new_srtt: 178 > >> test_new old_srtt: 100 mrtt_us: 90 new_srtt: 98 > >> > >> Signed-off-by: Weiping Zhang <zhangweiping@didiglobal.com> > > > > Please note that this analysis and this test program do not take > > account of the fact that srtt in the Linux kernel is maintained in a > > form where it is shifted left by 3 bits, to maintain a 3-bit > > fractional part. That is why at first glance it would seem there is a > > missing multiplication of the new sample by 1/8. By not shifting the > > new sample when it is added to srtt, the new sample is *implicitly* > > multiplied by 1/8. > > Nifty. And it's documented. > > struct tcp_sock { > … > u32 srtt_us; /* smoothed round trip time << 3 in usecs */ > > Thanks for the hint. > > >> net/ipv4/tcp_input.c | 4 ++-- > >> 1 file changed, 2 insertions(+), 2 deletions(-) > >> > >> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > >> index 0640453fce54..0242bb31e1ce 100644 > >> --- a/net/ipv4/tcp_input.c > >> +++ b/net/ipv4/tcp_input.c > >> @@ -848,7 +848,7 @@ static void tcp_rtt_estimator(struct sock *sk, long mrtt_us) As comments by Neal, srtt use lowest 3bits for fraction part, the @mrtt_us does not include fraction part, so m - srtt is not suitable, since m -= (srtt >> 3) will get the delta which exclude fraction part, and m will also be used to calculate mdev_us. if we calculate srtt by: srtt = srtt *7/8 + (mrtt_us << 3)* 1 / 8, it's more readable, append 3bits 0 for fraction part. srtt = srtt *7/8 + (mrtt_us << 3)* 1 / 8 srtt *7/8 => srtt - srtt >> 3 (mrtt_us << 3)* 1 / 8 => mrtt_us then srtt = srtt - srtt >> 3 + mrtt_us; srtt = srtt + mrtt_us - srtt >> 3 it's same as current code^_^ my previous patch does not consider the fraction part, it generates a wrong result. I write a new test code to decode the fraction part: test_old old_srtt: 100 => 12.4 mrtt_us: 90 new_srtt: 178 => 22.2 test_new old_srtt: 100 => 12.4 mrtt_us: 90 new_srtt: 98 => 12.2 #include<stdio.h> #include<stdlib.h> #define u32 unsigned int static void test_old(u32 srtt, long mrtt_us) { long m = mrtt_us; u32 old = srtt; m -= (srtt >> 3); srtt += m; printf("%s old_srtt: %8u => %8u.%u mrtt_us: %8ld new_srtt: %8u => %8u.%u\n", __func__, old, old >> 3, old & 7, mrtt_us, srtt, srtt>>3, srtt & 7); //printf("%s old_srtt>>3: %8u mrtt_us: %8ld new_srtt>>3: %u\n", __func__, old>>3, mrtt_us, srtt>>3); } static void test_new(u32 srtt, long mrtt_us) { long m = mrtt_us; u32 old = srtt; m = ((m - srtt) >> 3); srtt += m; printf("%s old_srtt: %8u => %8u.%u mrtt_us: %8ld new_srtt: %8u => %8u.%u\n", __func__, old, old >> 3, old & 7, mrtt_us, srtt, srtt>>3, srtt & 7); //printf("%s old_srtt>>3: %8u mrtt_us: %8ld new_srtt>>3: %u\n", __func__, old>>3, mrtt_us, srtt>>3); } int main(int argc, char **argv) { u32 srtt = atoi(argv[1]); long mrtt_us = atoi(argv[2]); test_old(srtt, mrtt_us); test_new(srtt, mrtt_us); return 0; } > >> * that VJ failed to avoid. 8) > >> */ > >> if (srtt != 0) { > >> - m -= (srtt >> 3); /* m is now error in rtt est */ > >> + m = (m - srtt >> 3); /* m is now error in rtt est */ > >> srtt += m; /* rtt = 7/8 rtt + 1/8 new */ > >> if (m < 0) { > >> m = -m; /* m is now abs(error) */ > >> @@ -864,7 +864,7 @@ static void tcp_rtt_estimator(struct sock *sk, long mrtt_us) > >> if (m > 0) > >> m >>= 3; > >> } else { > >> - m -= (tp->mdev_us >> 2); /* similar update on mdev */ > >> + m = (m - tp->mdev_us >> 2); /* similar update on mdev */ > >> } > >> tp->mdev_us += m; /* mdev = 3/4 mdev + 1/4 new */ > >> if (tp->mdev_us > tp->mdev_max_us) { > >> -- > >> 2.34.1 > > > > AFAICT this proposed patch does not change the behavior of the code > > but merely expresses the same operations with slightly different > > syntax. Am I missing something? :-) > > I've been wondering about that too. There's a change hiding behind > operator precedence. Would be better expressed with explicitly placed > parenthesis: > > m = (m - srtt) >> 3; /* m is now error in rtt est */
On Tue, Dec 6, 2022 at 10:29 AM Jakub Sitnicki <jakub@cloudflare.com> wrote: > Nifty. And it's documented. > > struct tcp_sock { > … > u32 srtt_us; /* smoothed round trip time << 3 in usecs */ > > Thanks for the hint. The >> 3 is all over the place... So even without a formal comment, anyone familiar with TCP stack would spot this... include/net/tcp.h:700: return usecs_to_jiffies((tp->srtt_us >> 3) + tp->rttvar_us); include/trace/events/tcp.h:286: __entry->srtt = tp->srtt_us >> 3; include/uapi/linux/bpf.h:6038: __u32 srtt_us; /* smoothed round trip time << 3 in usecs */ include/uapi/linux/bpf.h:6396: __u32 srtt_us; /* Averaged RTT << 3 in usecs */ et/ipv4/tcp.c:3906: info->tcpi_rtt = tp->srtt_us >> 3; net/ipv4/tcp.c:4045: nla_put_u32(stats, TCP_NLA_SRTT, tp->srtt_us >> 3); net/ipv4/tcp_bbr.c:273: if (tp->srtt_us) { /* any RTT sample yet? */ net/ipv4/tcp_bbr.c:274: rtt_us = max(tp->srtt_us >> 3, 1U); ...
On Sat, 10 Dec 2022 18:45:56 +0100 Eric Dumazet <edumazet@google.com> wrote: > On Tue, Dec 6, 2022 at 10:29 AM Jakub Sitnicki <jakub@cloudflare.com> wrote: > > > > Nifty. And it's documented. > > > > struct tcp_sock { > > … > > u32 srtt_us; /* smoothed round trip time << 3 in usecs */ > > > > Thanks for the hint. > > The >> 3 is all over the place... So even without a formal comment, > anyone familiar with TCP stack would spot this... And it should be in every text book already and is in BSD as well. Maybe a link to the SIGCOMM paper would be good for those google deprived people?
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 0640453fce54..0242bb31e1ce 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -848,7 +848,7 @@ static void tcp_rtt_estimator(struct sock *sk, long mrtt_us) * that VJ failed to avoid. 8) */ if (srtt != 0) { - m -= (srtt >> 3); /* m is now error in rtt est */ + m = (m - srtt >> 3); /* m is now error in rtt est */ srtt += m; /* rtt = 7/8 rtt + 1/8 new */ if (m < 0) { m = -m; /* m is now abs(error) */ @@ -864,7 +864,7 @@ static void tcp_rtt_estimator(struct sock *sk, long mrtt_us) if (m > 0) m >>= 3; } else { - m -= (tp->mdev_us >> 2); /* similar update on mdev */ + m = (m - tp->mdev_us >> 2); /* similar update on mdev */ } tp->mdev_us += m; /* mdev = 3/4 mdev + 1/4 new */ if (tp->mdev_us > tp->mdev_max_us) {