diff mbox series

[net-next,v3,1/1] net/udp_gso: Allow TX timestamp with UDP GSO

Message ID 20190528184415.16020-2-fklassen@appneta.com (mailing list archive)
State New
Headers show
Series Allow TX timestamp with UDP GSO | expand

Commit Message

Fred Klassen May 28, 2019, 6:44 p.m. UTC
Fixes an issue where TX Timestamps are not arriving on the error queue
when UDP_SEGMENT CMSG type is combined with CMSG type SO_TIMESTAMPING.
This can be illustrated with an updated updgso_bench_tx program which
includes the '-T' option to test for this condition.

    ./udpgso_bench_tx -4ucTPv -S 1472 -l2 -D 172.16.120.18
    poll timeout
    udp tx:      0 MB/s        1 calls/s      1 msg/s

The "poll timeout" message above indicates that TX timestamp never
arrived.

It also appears that other TX CMSG types cause similar issues, for
example trying to set SOL_IP/IP_TOS.

    ./udpgso_bench_tx -4ucPv -S 1472 -q 182 -l2 -D 172.16.120.18
    poll timeout
    udp tx:      0 MB/s        1 calls/s      1 msg/s

This patch preserves tx_flags for the first UDP GSO segment.

v2: Remove tests as noted by Willem de Bruijn <willemb@google.com>
    Moving tests from net to net-next

v3: Update only relevant tx_flag bits as per
    Willem de Bruijn <willemb@google.com>

Fixes: ee80d1ebe5ba ("udp: add udp gso")
Signed-off-by: Fred Klassen <fklassen@appneta.com>
---
 net/ipv4/udp_offload.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Willem de Bruijn May 28, 2019, 9:44 p.m. UTC | #1
On Tue, May 28, 2019 at 3:10 PM Fred Klassen <fklassen@appneta.com> wrote:
>
> Fixes an issue where TX Timestamps are not arriving on the error queue
> when UDP_SEGMENT CMSG type is combined with CMSG type SO_TIMESTAMPING.
> This can be illustrated with an updated updgso_bench_tx program which
> includes the '-T' option to test for this condition.
>
>     ./udpgso_bench_tx -4ucTPv -S 1472 -l2 -D 172.16.120.18
>     poll timeout
>     udp tx:      0 MB/s        1 calls/s      1 msg/s
>
> The "poll timeout" message above indicates that TX timestamp never
> arrived.
>
> It also appears that other TX CMSG types cause similar issues, for
> example trying to set SOL_IP/IP_TOS.

See previous comment in v2

http://patchwork.ozlabs.org/patch/1105564/

>
>     ./udpgso_bench_tx -4ucPv -S 1472 -q 182 -l2 -D 172.16.120.18
>     poll timeout
>     udp tx:      0 MB/s        1 calls/s      1 msg/s
>
> This patch preserves tx_flags for the first UDP GSO segment.
>
> v2: Remove tests as noted by Willem de Bruijn <willemb@google.com>
>     Moving tests from net to net-next
>
> v3: Update only relevant tx_flag bits as per
>     Willem de Bruijn <willemb@google.com>
>
> Fixes: ee80d1ebe5ba ("udp: add udp gso")
> Signed-off-by: Fred Klassen <fklassen@appneta.com>

FYI, no need for a cover letter for a single patch. Also, I think the
cc list can be more concise. Mainly netdev.

> ---
>  net/ipv4/udp_offload.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> index 065334b41d57..de8ecba42d55 100644
> --- a/net/ipv4/udp_offload.c
> +++ b/net/ipv4/udp_offload.c
> @@ -228,6 +228,11 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
>         seg = segs;
>         uh = udp_hdr(seg);
>
> +       /* preserve TX timestamp and zero-copy info for first segment */

Same as above. This is not about zerocopy.

> +       skb_shinfo(seg)->tskey = skb_shinfo(gso_skb)->tskey;
> +       skb_shinfo(seg)->tx_flags |=
> +                       (skb_shinfo(gso_skb)->tx_flags & SKBTX_ANY_TSTAMP);

Asked elsewhere, but best answered here: given that xmit_more delays
delivery to the NIC until the last segment in a train, is the first
segment in your opinion still the best to attach the timestamp request
to?

To reiterate, we do not want to need a follow-up patch to disable
xmit_more when timestamps are requested.


> +
>         /* compute checksum adjustment based on old length versus new */
>         newlen = htons(sizeof(*uh) + mss);
>         check = csum16_add(csum16_sub(uh->check, uh->len), newlen);
> --
> 2.11.0
>
Fred Klassen May 30, 2019, 4:11 p.m. UTC | #2
> On May 28, 2019, at 2:44 PM, Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
>> It also appears that other TX CMSG types cause similar issues, for
>> example trying to set SOL_IP/IP_TOS.
> 
> See previous comment in v2
> 
> http://patchwork.ozlabs.org/patch/1105564/
> 

Sorry, missed those updates. I am still relying too much on my emails.
Will fix for next version, and will scrub through messages on patchwork.

> FYI, no need for a cover letter for a single patch. Also, I think the
> cc list can be more concise. Mainly netdev.

Fixed.

> Same as above. This is not about zerocopy.
> 

Will fix next patchset.


> Asked elsewhere, but best answered here: given that xmit_more delays
> delivery to the NIC until the last segment in a train, is the first
> segment in your opinion still the best to attach the timestamp request
> to?
> 
> To reiterate, we do not want to need a follow-up patch to disable
> xmit_more when timestamps are requested.
> 

I think it would be worthwhile. I was playing with this patch …

+               /* software TX timeststamps are sent immediately */
+               if (tsflags & SKBTX_SW_TSTAMP)
+                       seg->xmit_more = 0;

… which attempts to address this issue. I believe that the patch
should be applied for software timestamps only. However when
I applied in net-next I got the following compile error, which suggests
there is more investigation needed, and therefore requires a separate
patch.

net/ipv4/udp_offload.c: In function ‘__udp_gso_segment’:
net/ipv4/udp_offload.c:251:7: error: ‘struct sk_buff’ has no member named ‘xmit_more’
    seg->xmit_more = 0;
Willem de Bruijn May 30, 2019, 7:17 p.m. UTC | #3
> > Asked elsewhere, but best answered here: given that xmit_more delays
> > delivery to the NIC until the last segment in a train, is the first
> > segment in your opinion still the best to attach the timestamp request
> > to?
> >
> > To reiterate, we do not want to need a follow-up patch to disable
> > xmit_more when timestamps are requested.
> >
>
> I think it would be worthwhile. I was playing with this patch …
>
> +               /* software TX timeststamps are sent immediately */
> +               if (tsflags & SKBTX_SW_TSTAMP)
> +                       seg->xmit_more = 0;
>
> … which attempts to address this issue. I believe that the patch
> should be applied for software timestamps only.

Disagree, sorry.

Timestamped packets should take the same path as non-timestamped, so
that sampled timestamps are representative of the overall workload.

Moreover, due to how xmit_more works, applying the timestamp request
to the last segment will give you exactly the behavior that you are
looking for (bar requeue events): a timestamp before the NIC starts
working on any byte in the request. And that approach will be useful
for measuring host latency as well, unlike timestamping the first
segment.

Timestamping the first, then arguing that it is not useful as is and
requires more changes is the wrong path imho.

Perhaps it is easiest to just not split off a segment from the GSO
train when timestamp that independently. That works today.

> However when
> I applied in net-next I got the following compile error, which suggests
> there is more investigation needed, and therefore requires a separate
> patch.
>
> net/ipv4/udp_offload.c: In function ‘__udp_gso_segment’:
> net/ipv4/udp_offload.c:251:7: error: ‘struct sk_buff’ has no member named ‘xmit_more’
>     seg->xmit_more = 0;

Yes, this has been moved to a percpu variable.
diff mbox series

Patch

diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 065334b41d57..de8ecba42d55 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -228,6 +228,11 @@  struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
 	seg = segs;
 	uh = udp_hdr(seg);
 
+	/* preserve TX timestamp and zero-copy info for first segment */
+	skb_shinfo(seg)->tskey = skb_shinfo(gso_skb)->tskey;
+	skb_shinfo(seg)->tx_flags |=
+			(skb_shinfo(gso_skb)->tx_flags & SKBTX_ANY_TSTAMP);
+
 	/* compute checksum adjustment based on old length versus new */
 	newlen = htons(sizeof(*uh) + mss);
 	check = csum16_add(csum16_sub(uh->check, uh->len), newlen);