diff mbox series

[net-next,v2,1/3] virtio-net: reprobe csum related fields for skb passed by XDP

Message ID 20230624122604.110958-2-hengqi@linux.alibaba.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series virtio-net: avoid conflicts between XDP and GUEST_CSUM | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit fail Errors and warnings before: 8 this patch: 9
netdev/cc_maintainers warning 1 maintainers not CCed: virtualization@lists.linux-foundation.org
netdev/build_clang fail Errors and warnings before: 8 this patch: 10
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn fail Errors and warnings before: 8 this patch: 9
netdev/checkpatch warning WARNING: line length of 83 exceeds 80 columns WARNING: line length of 84 exceeds 80 columns WARNING: line length of 85 exceeds 80 columns WARNING: line length of 86 exceeds 80 columns WARNING: line length of 87 exceeds 80 columns WARNING: line length of 88 exceeds 80 columns WARNING: line length of 91 exceeds 80 columns WARNING: line length of 92 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Heng Qi June 24, 2023, 12:26 p.m. UTC
Currently, the VIRTIO_NET_F_GUEST_CSUM (corresponds to NETIF_F_RXCSUM
for netdev) feature of the virtio-net driver conflicts with the loading
of XDP, which is caused by the problem described in [1][2], that is,
XDP may cause errors in partial csumed-related fields which can lead
to packet dropping.

In addition, when communicating between vm and vm on the same host, the
receiving side vm will receive packets marked as
VIRTIO_NET_HDR_F_NEEDS_CSUM, but after these packets are processed by
XDP, the VIRTIO_NET_HDR_F_NEEDS_CSUM and skb CHECKSUM_PARTIAL flags will
be cleared, causing the packet dropping.

This patch introduces a helper:
1. It will try to solve the above problems in the subsequent patch.
2. It parses UDP/TCP and calculates the pseudo-header checksum
for virtio-net. virtio-net currently does not resolve VLANs nor
SCTP CRC checksum offloading.

[1] commit 18ba58e1c234 ("virtio-net: fail XDP set if guest csum is negotiated")
[2] commit e59ff2c49ae1 ("virtio-net: disable guest csum during XDP set")

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 drivers/net/virtio_net.c | 136 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 136 insertions(+)

Comments

Simon Horman June 24, 2023, 7:28 p.m. UTC | #1
On Sat, Jun 24, 2023 at 08:26:02PM +0800, Heng Qi wrote:

...

> +static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
> +				      struct sk_buff *skb,
> +				      __u8 flags)

Hi Heng Qi,

Unfortunately this appears to break an x86_64 allmodconfig build
with GCC 12.3.0:

drivers/net/virtio_net.c:1671:12: error: 'virtnet_set_csum_after_xdp' defined but not used [-Werror=unused-function]
 1671 | static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
      |            ^~~~~~~~~~~~~~~~~~~~~~~~~~

--
pw-bot: changes-requested
Heng Qi June 25, 2023, 2:17 a.m. UTC | #2
在 2023/6/25 上午3:28, Simon Horman 写道:
> On Sat, Jun 24, 2023 at 08:26:02PM +0800, Heng Qi wrote:
>
> ...
>
>> +static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
>> +				      struct sk_buff *skb,
>> +				      __u8 flags)
> Hi Heng Qi,
>
> Unfortunately this appears to break an x86_64 allmodconfig build
> with GCC 12.3.0:
>
> drivers/net/virtio_net.c:1671:12: error: 'virtnet_set_csum_after_xdp' defined but not used [-Werror=unused-function]

I admit that this is a patch set, you can cherry-pick patches [1/3] and 
[2/3] together
to make it compile without 'defined but not used' warning. If people 
don't want to
separate [1/3] and [2/3], I can squash them into one. :)

Thanks.

>   1671 | static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
>        |            ^~~~~~~~~~~~~~~~~~~~~~~~~~
>
> --
> pw-bot: changes-requested
Jason Wang June 25, 2023, 6:44 a.m. UTC | #3
On Sat, Jun 24, 2023 at 8:26 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
> Currently, the VIRTIO_NET_F_GUEST_CSUM (corresponds to NETIF_F_RXCSUM
> for netdev) feature of the virtio-net driver conflicts with the loading
> of XDP, which is caused by the problem described in [1][2], that is,
> XDP may cause errors in partial csumed-related fields which can lead
> to packet dropping.
>
> In addition, when communicating between vm and vm on the same host, the
> receiving side vm will receive packets marked as
> VIRTIO_NET_HDR_F_NEEDS_CSUM, but after these packets are processed by
> XDP, the VIRTIO_NET_HDR_F_NEEDS_CSUM and skb CHECKSUM_PARTIAL flags will
> be cleared, causing the packet dropping.
>
> This patch introduces a helper:
> 1. It will try to solve the above problems in the subsequent patch.
> 2. It parses UDP/TCP and calculates the pseudo-header checksum
> for virtio-net. virtio-net currently does not resolve VLANs nor
> SCTP CRC checksum offloading.

Do we need to bother? Can we simply use skb_probe_transport_header()
and skb_checksum_help() which can simplify a lot of things?

Thanks

>
> [1] commit 18ba58e1c234 ("virtio-net: fail XDP set if guest csum is negotiated")
> [2] commit e59ff2c49ae1 ("virtio-net: disable guest csum during XDP set")
>
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
>  drivers/net/virtio_net.c | 136 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 136 insertions(+)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 5a7f7a76b920..83ab9257043a 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -22,6 +22,7 @@
>  #include <net/route.h>
>  #include <net/xdp.h>
>  #include <net/net_failover.h>
> +#include <net/ip6_checksum.h>
>
>  static int napi_weight = NAPI_POLL_WEIGHT;
>  module_param(napi_weight, int, 0444);
> @@ -1568,6 +1569,141 @@ static void virtio_skb_set_hash(const struct virtio_net_hdr_v1_hash *hdr_hash,
>         skb_set_hash(skb, __le32_to_cpu(hdr_hash->hash_value), rss_hash_type);
>  }
>
> +static int virtnet_flow_dissect_udp_tcp(struct virtnet_info *vi, struct sk_buff *skb)
> +{
> +       struct net_device *dev = vi->dev;
> +       struct flow_keys_basic keys;
> +       struct udphdr *uh;
> +       struct tcphdr *th;
> +       int len, offset;
> +
> +       /* The flow dissector needs this information. */
> +       skb->dev = dev;
> +       skb_reset_mac_header(skb);
> +       skb->protocol = dev_parse_header_protocol(skb);
> +       /* virtio-net does not need to resolve VLAN. */
> +       skb_set_network_header(skb, ETH_HLEN);
> +       if (!skb_flow_dissect_flow_keys_basic(NULL, skb, &keys,
> +                                             NULL, 0, 0, 0, 0))
> +               return -EINVAL;
> +
> +       /* 1. Pseudo-header checksum calculation requires:
> +        *    (1) saddr/daddr (2) IP_PROTO (3) length of transport payload
> +        * 2. We don't parse SCTP because virtio-net currently doesn't
> +        *    support CRC checksum offloading for SCTP.
> +        */
> +       if (keys.basic.n_proto == htons(ETH_P_IP)) {
> +               struct iphdr *iph;
> +
> +               /* Flow dissector has verified that there is an IP header.
> +                * So we do not need a pskb_may_pull().
> +                */
> +               iph = ip_hdr(skb);
> +               if (iph->version != 4)
> +                       return -EINVAL;
> +
> +               skb->transport_header = skb->mac_header + keys.control.thoff;
> +               offset = skb_transport_offset(skb);
> +               len = skb->len - offset;
> +               if (keys.basic.ip_proto == IPPROTO_UDP) {
> +                       if (!pskb_may_pull(skb, offset + sizeof(struct udphdr)))
> +                               return -EINVAL;
> +
> +                       uh = udp_hdr(skb);
> +                       skb->csum_offset = offsetof(struct udphdr, check);
> +                       /* Although uh->len is already the 3rd parameter for the calculation
> +                        * of the pseudo-header checksum, we have already calculated the
> +                        * length of the transport layer, so use 'len' here directly.
> +                        */
> +                       uh->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, len,
> +                                       IPPROTO_UDP, 0);
> +               } else if (keys.basic.ip_proto == IPPROTO_TCP) {
> +                       if (!pskb_may_pull(skb, offset + sizeof(struct tcphdr)))
> +                               return -EINVAL;
> +
> +                       th = tcp_hdr(skb);
> +                       skb->csum_offset = offsetof(struct tcphdr, check);
> +                       th->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, len,
> +                                       IPPROTO_TCP, 0);
> +               } /* virtio-net doesn't support checksums for SCTP crc hw offloading.*/
> +       } else if (keys.basic.n_proto == htons(ETH_P_IPV6)) {
> +               struct ipv6hdr *ip6h;
> +
> +               ip6h = ipv6_hdr(skb);
> +               if (ip6h->version != 6)
> +                       return -EINVAL;
> +
> +               /* We have skipped the possible extension headers for IPv6.
> +                * If there is a Routing Header, the tx's check value is calculated by
> +                * final_dst, and that value is the rx's daddr.
> +                */
> +               skb->transport_header = skb->mac_header + keys.control.thoff;
> +               offset = skb_transport_offset(skb);
> +               len = skb->len - offset;
> +               if (keys.basic.ip_proto == IPPROTO_UDP) {
> +                       if (!pskb_may_pull(skb, offset + sizeof(struct udphdr)))
> +                               return -EINVAL;
> +
> +                       uh = udp_hdr(skb);
> +                       skb->csum_offset = offsetof(struct udphdr, check);
> +                       uh->check = ~csum_ipv6_magic((const struct in6_addr *)&ip6h->saddr,
> +                                       (const struct in6_addr *)&ip6h->daddr,
> +                                       len, IPPROTO_UDP, 0);
> +               } else if (keys.basic.ip_proto == IPPROTO_TCP) {
> +                       if (!pskb_may_pull(skb, offset + sizeof(struct tcphdr)))
> +                               return -EINVAL;
> +
> +                       th = tcp_hdr(skb);
> +                       skb->csum_offset = offsetof(struct tcphdr, check);
> +                       th->check = ~csum_ipv6_magic((const struct in6_addr *)&ip6h->saddr,
> +                                       (const struct in6_addr *)&ip6h->daddr,
> +                                       len, IPPROTO_TCP, 0);
> +               }
> +       }
> +
> +       skb->csum_start = skb->transport_header;
> +
> +       return 0;
> +}
> +
> +static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
> +                                     struct sk_buff *skb,
> +                                     __u8 flags)
> +{
> +       int err;
> +
> +       /* When XDP program is loaded, for example, the vm-vm scenario
> +        * on the same host, packets marked as VIRTIO_NET_HDR_F_NEEDS_CSUM
> +        * will travel. Although these packets are safe from the point of
> +        * view of the vm, to avoid modification by XDP and successful
> +        * forwarding in the upper layer, we re-probe the necessary checksum
> +        * related information: skb->csum_{start, offset}, pseudo-header csum.
> +        *
> +        * This benefits us:
> +        * 1. XDP can be loaded when there's _F_GUEST_CSUM.
> +        * 2. The device verifies the checksum of packets , especially
> +        *    benefiting for large packets.
> +        * 3. In the same-host vm-vm scenario, packets marked as
> +        *    VIRTIO_NET_HDR_F_NEEDS_CSUM are no longer dropped after being
> +        *    processed by XDP.
> +        */
> +       if (flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
> +               err = virtnet_flow_dissect_udp_tcp(vi, skb);
> +               if (err < 0)
> +                       return -EINVAL;
> +
> +               skb->ip_summed = CHECKSUM_PARTIAL;
> +       } else if (flags & VIRTIO_NET_HDR_F_DATA_VALID) {
> +               /* We want to benefit from this: XDP guarantees that packets marked
> +                * as VIRTIO_NET_HDR_F_DATA_VALID still have correct csum after they
> +                * are processed.
> +                */
> +               skb->ip_summed = CHECKSUM_UNNECESSARY;
> +       }
> +
> +       return 0;
> +}
> +
>  static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
>                         void *buf, unsigned int len, void **ctx,
>                         unsigned int *xdp_xmit,
> --
> 2.19.1.6.gb485710b
>
Heng Qi June 25, 2023, 7:27 a.m. UTC | #4
On Sun, Jun 25, 2023 at 02:44:07PM +0800, Jason Wang wrote:
> On Sat, Jun 24, 2023 at 8:26 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >
> > Currently, the VIRTIO_NET_F_GUEST_CSUM (corresponds to NETIF_F_RXCSUM
> > for netdev) feature of the virtio-net driver conflicts with the loading
> > of XDP, which is caused by the problem described in [1][2], that is,
> > XDP may cause errors in partial csumed-related fields which can lead
> > to packet dropping.
> >
> > In addition, when communicating between vm and vm on the same host, the
> > receiving side vm will receive packets marked as
> > VIRTIO_NET_HDR_F_NEEDS_CSUM, but after these packets are processed by
> > XDP, the VIRTIO_NET_HDR_F_NEEDS_CSUM and skb CHECKSUM_PARTIAL flags will
> > be cleared, causing the packet dropping.
> >
> > This patch introduces a helper:
> > 1. It will try to solve the above problems in the subsequent patch.
> > 2. It parses UDP/TCP and calculates the pseudo-header checksum
> > for virtio-net. virtio-net currently does not resolve VLANs nor
> > SCTP CRC checksum offloading.
> 
> Do we need to bother? Can we simply use skb_probe_transport_header()
> and skb_checksum_help() which can simplify a lot of things?

We need to. We only compute partial checksums (not complete checksums) for
packets marked as NEEDS_CSUM. Please see skb_csum_unnecessary(), which will
consider packets with the partial checksum to be valid by the
protocol stack (Because it believes that the communication between the
VMs of the same host is reliable.).

In order to calculate the pseudo-header checksum, we need the IP address and the
\field{check} position of the transport layer header.
skb_probe_transport_header() only finds out the location of the
transport header and does not provide us with other information we
need. skb_checksum_help() is to calculate the complete checksum on the
tx path, which is not our purpose.

Thanks!

> 
> Thanks
> 
> >
> > [1] commit 18ba58e1c234 ("virtio-net: fail XDP set if guest csum is negotiated")
> > [2] commit e59ff2c49ae1 ("virtio-net: disable guest csum during XDP set")
> >
> > Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> > Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > ---
> >  drivers/net/virtio_net.c | 136 +++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 136 insertions(+)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 5a7f7a76b920..83ab9257043a 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -22,6 +22,7 @@
> >  #include <net/route.h>
> >  #include <net/xdp.h>
> >  #include <net/net_failover.h>
> > +#include <net/ip6_checksum.h>
> >
> >  static int napi_weight = NAPI_POLL_WEIGHT;
> >  module_param(napi_weight, int, 0444);
> > @@ -1568,6 +1569,141 @@ static void virtio_skb_set_hash(const struct virtio_net_hdr_v1_hash *hdr_hash,
> >         skb_set_hash(skb, __le32_to_cpu(hdr_hash->hash_value), rss_hash_type);
> >  }
> >
> > +static int virtnet_flow_dissect_udp_tcp(struct virtnet_info *vi, struct sk_buff *skb)
> > +{
> > +       struct net_device *dev = vi->dev;
> > +       struct flow_keys_basic keys;
> > +       struct udphdr *uh;
> > +       struct tcphdr *th;
> > +       int len, offset;
> > +
> > +       /* The flow dissector needs this information. */
> > +       skb->dev = dev;
> > +       skb_reset_mac_header(skb);
> > +       skb->protocol = dev_parse_header_protocol(skb);
> > +       /* virtio-net does not need to resolve VLAN. */
> > +       skb_set_network_header(skb, ETH_HLEN);
> > +       if (!skb_flow_dissect_flow_keys_basic(NULL, skb, &keys,
> > +                                             NULL, 0, 0, 0, 0))
> > +               return -EINVAL;
> > +
> > +       /* 1. Pseudo-header checksum calculation requires:
> > +        *    (1) saddr/daddr (2) IP_PROTO (3) length of transport payload
> > +        * 2. We don't parse SCTP because virtio-net currently doesn't
> > +        *    support CRC checksum offloading for SCTP.
> > +        */
> > +       if (keys.basic.n_proto == htons(ETH_P_IP)) {
> > +               struct iphdr *iph;
> > +
> > +               /* Flow dissector has verified that there is an IP header.
> > +                * So we do not need a pskb_may_pull().
> > +                */
> > +               iph = ip_hdr(skb);
> > +               if (iph->version != 4)
> > +                       return -EINVAL;
> > +
> > +               skb->transport_header = skb->mac_header + keys.control.thoff;
> > +               offset = skb_transport_offset(skb);
> > +               len = skb->len - offset;
> > +               if (keys.basic.ip_proto == IPPROTO_UDP) {
> > +                       if (!pskb_may_pull(skb, offset + sizeof(struct udphdr)))
> > +                               return -EINVAL;
> > +
> > +                       uh = udp_hdr(skb);
> > +                       skb->csum_offset = offsetof(struct udphdr, check);
> > +                       /* Although uh->len is already the 3rd parameter for the calculation
> > +                        * of the pseudo-header checksum, we have already calculated the
> > +                        * length of the transport layer, so use 'len' here directly.
> > +                        */
> > +                       uh->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, len,
> > +                                       IPPROTO_UDP, 0);
> > +               } else if (keys.basic.ip_proto == IPPROTO_TCP) {
> > +                       if (!pskb_may_pull(skb, offset + sizeof(struct tcphdr)))
> > +                               return -EINVAL;
> > +
> > +                       th = tcp_hdr(skb);
> > +                       skb->csum_offset = offsetof(struct tcphdr, check);
> > +                       th->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, len,
> > +                                       IPPROTO_TCP, 0);
> > +               } /* virtio-net doesn't support checksums for SCTP crc hw offloading.*/
> > +       } else if (keys.basic.n_proto == htons(ETH_P_IPV6)) {
> > +               struct ipv6hdr *ip6h;
> > +
> > +               ip6h = ipv6_hdr(skb);
> > +               if (ip6h->version != 6)
> > +                       return -EINVAL;
> > +
> > +               /* We have skipped the possible extension headers for IPv6.
> > +                * If there is a Routing Header, the tx's check value is calculated by
> > +                * final_dst, and that value is the rx's daddr.
> > +                */
> > +               skb->transport_header = skb->mac_header + keys.control.thoff;
> > +               offset = skb_transport_offset(skb);
> > +               len = skb->len - offset;
> > +               if (keys.basic.ip_proto == IPPROTO_UDP) {
> > +                       if (!pskb_may_pull(skb, offset + sizeof(struct udphdr)))
> > +                               return -EINVAL;
> > +
> > +                       uh = udp_hdr(skb);
> > +                       skb->csum_offset = offsetof(struct udphdr, check);
> > +                       uh->check = ~csum_ipv6_magic((const struct in6_addr *)&ip6h->saddr,
> > +                                       (const struct in6_addr *)&ip6h->daddr,
> > +                                       len, IPPROTO_UDP, 0);
> > +               } else if (keys.basic.ip_proto == IPPROTO_TCP) {
> > +                       if (!pskb_may_pull(skb, offset + sizeof(struct tcphdr)))
> > +                               return -EINVAL;
> > +
> > +                       th = tcp_hdr(skb);
> > +                       skb->csum_offset = offsetof(struct tcphdr, check);
> > +                       th->check = ~csum_ipv6_magic((const struct in6_addr *)&ip6h->saddr,
> > +                                       (const struct in6_addr *)&ip6h->daddr,
> > +                                       len, IPPROTO_TCP, 0);
> > +               }
> > +       }
> > +
> > +       skb->csum_start = skb->transport_header;
> > +
> > +       return 0;
> > +}
> > +
> > +static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
> > +                                     struct sk_buff *skb,
> > +                                     __u8 flags)
> > +{
> > +       int err;
> > +
> > +       /* When XDP program is loaded, for example, the vm-vm scenario
> > +        * on the same host, packets marked as VIRTIO_NET_HDR_F_NEEDS_CSUM
> > +        * will travel. Although these packets are safe from the point of
> > +        * view of the vm, to avoid modification by XDP and successful
> > +        * forwarding in the upper layer, we re-probe the necessary checksum
> > +        * related information: skb->csum_{start, offset}, pseudo-header csum.
> > +        *
> > +        * This benefits us:
> > +        * 1. XDP can be loaded when there's _F_GUEST_CSUM.
> > +        * 2. The device verifies the checksum of packets , especially
> > +        *    benefiting for large packets.
> > +        * 3. In the same-host vm-vm scenario, packets marked as
> > +        *    VIRTIO_NET_HDR_F_NEEDS_CSUM are no longer dropped after being
> > +        *    processed by XDP.
> > +        */
> > +       if (flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
> > +               err = virtnet_flow_dissect_udp_tcp(vi, skb);
> > +               if (err < 0)
> > +                       return -EINVAL;
> > +
> > +               skb->ip_summed = CHECKSUM_PARTIAL;
> > +       } else if (flags & VIRTIO_NET_HDR_F_DATA_VALID) {
> > +               /* We want to benefit from this: XDP guarantees that packets marked
> > +                * as VIRTIO_NET_HDR_F_DATA_VALID still have correct csum after they
> > +                * are processed.
> > +                */
> > +               skb->ip_summed = CHECKSUM_UNNECESSARY;
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> >  static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
> >                         void *buf, unsigned int len, void **ctx,
> >                         unsigned int *xdp_xmit,
> > --
> > 2.19.1.6.gb485710b
> >
Michael S. Tsirkin June 25, 2023, 9:53 p.m. UTC | #5
On Sun, Jun 25, 2023 at 10:17:15AM +0800, Heng Qi wrote:
> 
> 
> 在 2023/6/25 上午3:28, Simon Horman 写道:
> > On Sat, Jun 24, 2023 at 08:26:02PM +0800, Heng Qi wrote:
> > 
> > ...
> > 
> > > +static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
> > > +				      struct sk_buff *skb,
> > > +				      __u8 flags)
> > Hi Heng Qi,
> > 
> > Unfortunately this appears to break an x86_64 allmodconfig build
> > with GCC 12.3.0:
> > 
> > drivers/net/virtio_net.c:1671:12: error: 'virtnet_set_csum_after_xdp' defined but not used [-Werror=unused-function]
> 
> I admit that this is a patch set, you can cherry-pick patches [1/3] and
> [2/3] together
> to make it compile without 'defined but not used' warning. If people don't
> want to
> separate [1/3] and [2/3], I can squash them into one. :)
> 
> Thanks.

the answer is yes, do not break the build.

> >   1671 | static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
> >        |            ^~~~~~~~~~~~~~~~~~~~~~~~~~
> > 
> > --
> > pw-bot: changes-requested
Jason Wang June 26, 2023, 2:09 a.m. UTC | #6
On Sun, Jun 25, 2023 at 3:28 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
> On Sun, Jun 25, 2023 at 02:44:07PM +0800, Jason Wang wrote:
> > On Sat, Jun 24, 2023 at 8:26 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> > >
> > > Currently, the VIRTIO_NET_F_GUEST_CSUM (corresponds to NETIF_F_RXCSUM
> > > for netdev) feature of the virtio-net driver conflicts with the loading
> > > of XDP, which is caused by the problem described in [1][2], that is,
> > > XDP may cause errors in partial csumed-related fields which can lead
> > > to packet dropping.
> > >
> > > In addition, when communicating between vm and vm on the same host, the
> > > receiving side vm will receive packets marked as
> > > VIRTIO_NET_HDR_F_NEEDS_CSUM, but after these packets are processed by
> > > XDP, the VIRTIO_NET_HDR_F_NEEDS_CSUM and skb CHECKSUM_PARTIAL flags will
> > > be cleared, causing the packet dropping.
> > >
> > > This patch introduces a helper:
> > > 1. It will try to solve the above problems in the subsequent patch.
> > > 2. It parses UDP/TCP and calculates the pseudo-header checksum
> > > for virtio-net. virtio-net currently does not resolve VLANs nor
> > > SCTP CRC checksum offloading.
> >
> > Do we need to bother? Can we simply use skb_probe_transport_header()
> > and skb_checksum_help() which can simplify a lot of things?
>
> We need to. We only compute partial checksums (not complete checksums) for
> packets marked as NEEDS_CSUM. Please see skb_csum_unnecessary(), which will
> consider packets with the partial checksum to be valid by the
> protocol stack (Because it believes that the communication between the
> VMs of the same host is reliable.).
>
> In order to calculate the pseudo-header checksum, we need the IP address and the
> \field{check} position of the transport layer header.
> skb_probe_transport_header() only finds out the location of the
> transport header and does not provide us with other information we
> need. skb_checksum_help() is to calculate the complete checksum on the
> tx path, which is not our purpose.

A typo in my previous reply, actually, I meant skb_checksum_setup().

I think the point is to have a general purpose helper in the core
instead of duplicating codes in any specific driver.

Thanks

>
> Thanks!
>
> >
> > Thanks
> >
> > >
> > > [1] commit 18ba58e1c234 ("virtio-net: fail XDP set if guest csum is negotiated")
> > > [2] commit e59ff2c49ae1 ("virtio-net: disable guest csum during XDP set")
> > >
> > > Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> > > Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > ---
> > >  drivers/net/virtio_net.c | 136 +++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 136 insertions(+)
> > >
> > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > index 5a7f7a76b920..83ab9257043a 100644
> > > --- a/drivers/net/virtio_net.c
> > > +++ b/drivers/net/virtio_net.c
> > > @@ -22,6 +22,7 @@
> > >  #include <net/route.h>
> > >  #include <net/xdp.h>
> > >  #include <net/net_failover.h>
> > > +#include <net/ip6_checksum.h>
> > >
> > >  static int napi_weight = NAPI_POLL_WEIGHT;
> > >  module_param(napi_weight, int, 0444);
> > > @@ -1568,6 +1569,141 @@ static void virtio_skb_set_hash(const struct virtio_net_hdr_v1_hash *hdr_hash,
> > >         skb_set_hash(skb, __le32_to_cpu(hdr_hash->hash_value), rss_hash_type);
> > >  }
> > >
> > > +static int virtnet_flow_dissect_udp_tcp(struct virtnet_info *vi, struct sk_buff *skb)
> > > +{
> > > +       struct net_device *dev = vi->dev;
> > > +       struct flow_keys_basic keys;
> > > +       struct udphdr *uh;
> > > +       struct tcphdr *th;
> > > +       int len, offset;
> > > +
> > > +       /* The flow dissector needs this information. */
> > > +       skb->dev = dev;
> > > +       skb_reset_mac_header(skb);
> > > +       skb->protocol = dev_parse_header_protocol(skb);
> > > +       /* virtio-net does not need to resolve VLAN. */
> > > +       skb_set_network_header(skb, ETH_HLEN);
> > > +       if (!skb_flow_dissect_flow_keys_basic(NULL, skb, &keys,
> > > +                                             NULL, 0, 0, 0, 0))
> > > +               return -EINVAL;
> > > +
> > > +       /* 1. Pseudo-header checksum calculation requires:
> > > +        *    (1) saddr/daddr (2) IP_PROTO (3) length of transport payload
> > > +        * 2. We don't parse SCTP because virtio-net currently doesn't
> > > +        *    support CRC checksum offloading for SCTP.
> > > +        */
> > > +       if (keys.basic.n_proto == htons(ETH_P_IP)) {
> > > +               struct iphdr *iph;
> > > +
> > > +               /* Flow dissector has verified that there is an IP header.
> > > +                * So we do not need a pskb_may_pull().
> > > +                */
> > > +               iph = ip_hdr(skb);
> > > +               if (iph->version != 4)
> > > +                       return -EINVAL;
> > > +
> > > +               skb->transport_header = skb->mac_header + keys.control.thoff;
> > > +               offset = skb_transport_offset(skb);
> > > +               len = skb->len - offset;
> > > +               if (keys.basic.ip_proto == IPPROTO_UDP) {
> > > +                       if (!pskb_may_pull(skb, offset + sizeof(struct udphdr)))
> > > +                               return -EINVAL;
> > > +
> > > +                       uh = udp_hdr(skb);
> > > +                       skb->csum_offset = offsetof(struct udphdr, check);
> > > +                       /* Although uh->len is already the 3rd parameter for the calculation
> > > +                        * of the pseudo-header checksum, we have already calculated the
> > > +                        * length of the transport layer, so use 'len' here directly.
> > > +                        */
> > > +                       uh->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, len,
> > > +                                       IPPROTO_UDP, 0);
> > > +               } else if (keys.basic.ip_proto == IPPROTO_TCP) {
> > > +                       if (!pskb_may_pull(skb, offset + sizeof(struct tcphdr)))
> > > +                               return -EINVAL;
> > > +
> > > +                       th = tcp_hdr(skb);
> > > +                       skb->csum_offset = offsetof(struct tcphdr, check);
> > > +                       th->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, len,
> > > +                                       IPPROTO_TCP, 0);
> > > +               } /* virtio-net doesn't support checksums for SCTP crc hw offloading.*/
> > > +       } else if (keys.basic.n_proto == htons(ETH_P_IPV6)) {
> > > +               struct ipv6hdr *ip6h;
> > > +
> > > +               ip6h = ipv6_hdr(skb);
> > > +               if (ip6h->version != 6)
> > > +                       return -EINVAL;
> > > +
> > > +               /* We have skipped the possible extension headers for IPv6.
> > > +                * If there is a Routing Header, the tx's check value is calculated by
> > > +                * final_dst, and that value is the rx's daddr.
> > > +                */
> > > +               skb->transport_header = skb->mac_header + keys.control.thoff;
> > > +               offset = skb_transport_offset(skb);
> > > +               len = skb->len - offset;
> > > +               if (keys.basic.ip_proto == IPPROTO_UDP) {
> > > +                       if (!pskb_may_pull(skb, offset + sizeof(struct udphdr)))
> > > +                               return -EINVAL;
> > > +
> > > +                       uh = udp_hdr(skb);
> > > +                       skb->csum_offset = offsetof(struct udphdr, check);
> > > +                       uh->check = ~csum_ipv6_magic((const struct in6_addr *)&ip6h->saddr,
> > > +                                       (const struct in6_addr *)&ip6h->daddr,
> > > +                                       len, IPPROTO_UDP, 0);
> > > +               } else if (keys.basic.ip_proto == IPPROTO_TCP) {
> > > +                       if (!pskb_may_pull(skb, offset + sizeof(struct tcphdr)))
> > > +                               return -EINVAL;
> > > +
> > > +                       th = tcp_hdr(skb);
> > > +                       skb->csum_offset = offsetof(struct tcphdr, check);
> > > +                       th->check = ~csum_ipv6_magic((const struct in6_addr *)&ip6h->saddr,
> > > +                                       (const struct in6_addr *)&ip6h->daddr,
> > > +                                       len, IPPROTO_TCP, 0);
> > > +               }
> > > +       }
> > > +
> > > +       skb->csum_start = skb->transport_header;
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
> > > +                                     struct sk_buff *skb,
> > > +                                     __u8 flags)
> > > +{
> > > +       int err;
> > > +
> > > +       /* When XDP program is loaded, for example, the vm-vm scenario
> > > +        * on the same host, packets marked as VIRTIO_NET_HDR_F_NEEDS_CSUM
> > > +        * will travel. Although these packets are safe from the point of
> > > +        * view of the vm, to avoid modification by XDP and successful
> > > +        * forwarding in the upper layer, we re-probe the necessary checksum
> > > +        * related information: skb->csum_{start, offset}, pseudo-header csum.
> > > +        *
> > > +        * This benefits us:
> > > +        * 1. XDP can be loaded when there's _F_GUEST_CSUM.
> > > +        * 2. The device verifies the checksum of packets , especially
> > > +        *    benefiting for large packets.
> > > +        * 3. In the same-host vm-vm scenario, packets marked as
> > > +        *    VIRTIO_NET_HDR_F_NEEDS_CSUM are no longer dropped after being
> > > +        *    processed by XDP.
> > > +        */
> > > +       if (flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
> > > +               err = virtnet_flow_dissect_udp_tcp(vi, skb);
> > > +               if (err < 0)
> > > +                       return -EINVAL;
> > > +
> > > +               skb->ip_summed = CHECKSUM_PARTIAL;
> > > +       } else if (flags & VIRTIO_NET_HDR_F_DATA_VALID) {
> > > +               /* We want to benefit from this: XDP guarantees that packets marked
> > > +                * as VIRTIO_NET_HDR_F_DATA_VALID still have correct csum after they
> > > +                * are processed.
> > > +                */
> > > +               skb->ip_summed = CHECKSUM_UNNECESSARY;
> > > +       }
> > > +
> > > +       return 0;
> > > +}
> > > +
> > >  static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
> > >                         void *buf, unsigned int len, void **ctx,
> > >                         unsigned int *xdp_xmit,
> > > --
> > > 2.19.1.6.gb485710b
> > >
>
Heng Qi June 26, 2023, 3:34 a.m. UTC | #7
在 2023/6/26 上午10:09, Jason Wang 写道:
> On Sun, Jun 25, 2023 at 3:28 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>> On Sun, Jun 25, 2023 at 02:44:07PM +0800, Jason Wang wrote:
>>> On Sat, Jun 24, 2023 at 8:26 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>>>> Currently, the VIRTIO_NET_F_GUEST_CSUM (corresponds to NETIF_F_RXCSUM
>>>> for netdev) feature of the virtio-net driver conflicts with the loading
>>>> of XDP, which is caused by the problem described in [1][2], that is,
>>>> XDP may cause errors in partial csumed-related fields which can lead
>>>> to packet dropping.
>>>>
>>>> In addition, when communicating between vm and vm on the same host, the
>>>> receiving side vm will receive packets marked as
>>>> VIRTIO_NET_HDR_F_NEEDS_CSUM, but after these packets are processed by
>>>> XDP, the VIRTIO_NET_HDR_F_NEEDS_CSUM and skb CHECKSUM_PARTIAL flags will
>>>> be cleared, causing the packet dropping.
>>>>
>>>> This patch introduces a helper:
>>>> 1. It will try to solve the above problems in the subsequent patch.
>>>> 2. It parses UDP/TCP and calculates the pseudo-header checksum
>>>> for virtio-net. virtio-net currently does not resolve VLANs nor
>>>> SCTP CRC checksum offloading.
>>> Do we need to bother? Can we simply use skb_probe_transport_header()
>>> and skb_checksum_help() which can simplify a lot of things?
>> We need to. We only compute partial checksums (not complete checksums) for
>> packets marked as NEEDS_CSUM. Please see skb_csum_unnecessary(), which will
>> consider packets with the partial checksum to be valid by the
>> protocol stack (Because it believes that the communication between the
>> VMs of the same host is reliable.).
>>
>> In order to calculate the pseudo-header checksum, we need the IP address and the
>> \field{check} position of the transport layer header.
>> skb_probe_transport_header() only finds out the location of the
>> transport header and does not provide us with other information we
>> need. skb_checksum_help() is to calculate the complete checksum on the
>> tx path, which is not our purpose.
> A typo in my previous reply, actually, I meant skb_checksum_setup().

I think this could work for us! I'm going to modify some of the code to 
make it work for us. ^^

>
> I think the point is to have a general purpose helper in the core
> instead of duplicating codes in any specific driver.

Absolutely.

Thanks!

>
> Thanks
>
>> Thanks!
>>
>>> Thanks
>>>
>>>> [1] commit 18ba58e1c234 ("virtio-net: fail XDP set if guest csum is negotiated")
>>>> [2] commit e59ff2c49ae1 ("virtio-net: disable guest csum during XDP set")
>>>>
>>>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>>>> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>>>> ---
>>>>   drivers/net/virtio_net.c | 136 +++++++++++++++++++++++++++++++++++++++
>>>>   1 file changed, 136 insertions(+)
>>>>
>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>>> index 5a7f7a76b920..83ab9257043a 100644
>>>> --- a/drivers/net/virtio_net.c
>>>> +++ b/drivers/net/virtio_net.c
>>>> @@ -22,6 +22,7 @@
>>>>   #include <net/route.h>
>>>>   #include <net/xdp.h>
>>>>   #include <net/net_failover.h>
>>>> +#include <net/ip6_checksum.h>
>>>>
>>>>   static int napi_weight = NAPI_POLL_WEIGHT;
>>>>   module_param(napi_weight, int, 0444);
>>>> @@ -1568,6 +1569,141 @@ static void virtio_skb_set_hash(const struct virtio_net_hdr_v1_hash *hdr_hash,
>>>>          skb_set_hash(skb, __le32_to_cpu(hdr_hash->hash_value), rss_hash_type);
>>>>   }
>>>>
>>>> +static int virtnet_flow_dissect_udp_tcp(struct virtnet_info *vi, struct sk_buff *skb)
>>>> +{
>>>> +       struct net_device *dev = vi->dev;
>>>> +       struct flow_keys_basic keys;
>>>> +       struct udphdr *uh;
>>>> +       struct tcphdr *th;
>>>> +       int len, offset;
>>>> +
>>>> +       /* The flow dissector needs this information. */
>>>> +       skb->dev = dev;
>>>> +       skb_reset_mac_header(skb);
>>>> +       skb->protocol = dev_parse_header_protocol(skb);
>>>> +       /* virtio-net does not need to resolve VLAN. */
>>>> +       skb_set_network_header(skb, ETH_HLEN);
>>>> +       if (!skb_flow_dissect_flow_keys_basic(NULL, skb, &keys,
>>>> +                                             NULL, 0, 0, 0, 0))
>>>> +               return -EINVAL;
>>>> +
>>>> +       /* 1. Pseudo-header checksum calculation requires:
>>>> +        *    (1) saddr/daddr (2) IP_PROTO (3) length of transport payload
>>>> +        * 2. We don't parse SCTP because virtio-net currently doesn't
>>>> +        *    support CRC checksum offloading for SCTP.
>>>> +        */
>>>> +       if (keys.basic.n_proto == htons(ETH_P_IP)) {
>>>> +               struct iphdr *iph;
>>>> +
>>>> +               /* Flow dissector has verified that there is an IP header.
>>>> +                * So we do not need a pskb_may_pull().
>>>> +                */
>>>> +               iph = ip_hdr(skb);
>>>> +               if (iph->version != 4)
>>>> +                       return -EINVAL;
>>>> +
>>>> +               skb->transport_header = skb->mac_header + keys.control.thoff;
>>>> +               offset = skb_transport_offset(skb);
>>>> +               len = skb->len - offset;
>>>> +               if (keys.basic.ip_proto == IPPROTO_UDP) {
>>>> +                       if (!pskb_may_pull(skb, offset + sizeof(struct udphdr)))
>>>> +                               return -EINVAL;
>>>> +
>>>> +                       uh = udp_hdr(skb);
>>>> +                       skb->csum_offset = offsetof(struct udphdr, check);
>>>> +                       /* Although uh->len is already the 3rd parameter for the calculation
>>>> +                        * of the pseudo-header checksum, we have already calculated the
>>>> +                        * length of the transport layer, so use 'len' here directly.
>>>> +                        */
>>>> +                       uh->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, len,
>>>> +                                       IPPROTO_UDP, 0);
>>>> +               } else if (keys.basic.ip_proto == IPPROTO_TCP) {
>>>> +                       if (!pskb_may_pull(skb, offset + sizeof(struct tcphdr)))
>>>> +                               return -EINVAL;
>>>> +
>>>> +                       th = tcp_hdr(skb);
>>>> +                       skb->csum_offset = offsetof(struct tcphdr, check);
>>>> +                       th->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, len,
>>>> +                                       IPPROTO_TCP, 0);
>>>> +               } /* virtio-net doesn't support checksums for SCTP crc hw offloading.*/
>>>> +       } else if (keys.basic.n_proto == htons(ETH_P_IPV6)) {
>>>> +               struct ipv6hdr *ip6h;
>>>> +
>>>> +               ip6h = ipv6_hdr(skb);
>>>> +               if (ip6h->version != 6)
>>>> +                       return -EINVAL;
>>>> +
>>>> +               /* We have skipped the possible extension headers for IPv6.
>>>> +                * If there is a Routing Header, the tx's check value is calculated by
>>>> +                * final_dst, and that value is the rx's daddr.
>>>> +                */
>>>> +               skb->transport_header = skb->mac_header + keys.control.thoff;
>>>> +               offset = skb_transport_offset(skb);
>>>> +               len = skb->len - offset;
>>>> +               if (keys.basic.ip_proto == IPPROTO_UDP) {
>>>> +                       if (!pskb_may_pull(skb, offset + sizeof(struct udphdr)))
>>>> +                               return -EINVAL;
>>>> +
>>>> +                       uh = udp_hdr(skb);
>>>> +                       skb->csum_offset = offsetof(struct udphdr, check);
>>>> +                       uh->check = ~csum_ipv6_magic((const struct in6_addr *)&ip6h->saddr,
>>>> +                                       (const struct in6_addr *)&ip6h->daddr,
>>>> +                                       len, IPPROTO_UDP, 0);
>>>> +               } else if (keys.basic.ip_proto == IPPROTO_TCP) {
>>>> +                       if (!pskb_may_pull(skb, offset + sizeof(struct tcphdr)))
>>>> +                               return -EINVAL;
>>>> +
>>>> +                       th = tcp_hdr(skb);
>>>> +                       skb->csum_offset = offsetof(struct tcphdr, check);
>>>> +                       th->check = ~csum_ipv6_magic((const struct in6_addr *)&ip6h->saddr,
>>>> +                                       (const struct in6_addr *)&ip6h->daddr,
>>>> +                                       len, IPPROTO_TCP, 0);
>>>> +               }
>>>> +       }
>>>> +
>>>> +       skb->csum_start = skb->transport_header;
>>>> +
>>>> +       return 0;
>>>> +}
>>>> +
>>>> +static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
>>>> +                                     struct sk_buff *skb,
>>>> +                                     __u8 flags)
>>>> +{
>>>> +       int err;
>>>> +
>>>> +       /* When XDP program is loaded, for example, the vm-vm scenario
>>>> +        * on the same host, packets marked as VIRTIO_NET_HDR_F_NEEDS_CSUM
>>>> +        * will travel. Although these packets are safe from the point of
>>>> +        * view of the vm, to avoid modification by XDP and successful
>>>> +        * forwarding in the upper layer, we re-probe the necessary checksum
>>>> +        * related information: skb->csum_{start, offset}, pseudo-header csum.
>>>> +        *
>>>> +        * This benefits us:
>>>> +        * 1. XDP can be loaded when there's _F_GUEST_CSUM.
>>>> +        * 2. The device verifies the checksum of packets , especially
>>>> +        *    benefiting for large packets.
>>>> +        * 3. In the same-host vm-vm scenario, packets marked as
>>>> +        *    VIRTIO_NET_HDR_F_NEEDS_CSUM are no longer dropped after being
>>>> +        *    processed by XDP.
>>>> +        */
>>>> +       if (flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
>>>> +               err = virtnet_flow_dissect_udp_tcp(vi, skb);
>>>> +               if (err < 0)
>>>> +                       return -EINVAL;
>>>> +
>>>> +               skb->ip_summed = CHECKSUM_PARTIAL;
>>>> +       } else if (flags & VIRTIO_NET_HDR_F_DATA_VALID) {
>>>> +               /* We want to benefit from this: XDP guarantees that packets marked
>>>> +                * as VIRTIO_NET_HDR_F_DATA_VALID still have correct csum after they
>>>> +                * are processed.
>>>> +                */
>>>> +               skb->ip_summed = CHECKSUM_UNNECESSARY;
>>>> +       }
>>>> +
>>>> +       return 0;
>>>> +}
>>>> +
>>>>   static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
>>>>                          void *buf, unsigned int len, void **ctx,
>>>>                          unsigned int *xdp_xmit,
>>>> --
>>>> 2.19.1.6.gb485710b
>>>>
diff mbox series

Patch

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 5a7f7a76b920..83ab9257043a 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -22,6 +22,7 @@ 
 #include <net/route.h>
 #include <net/xdp.h>
 #include <net/net_failover.h>
+#include <net/ip6_checksum.h>
 
 static int napi_weight = NAPI_POLL_WEIGHT;
 module_param(napi_weight, int, 0444);
@@ -1568,6 +1569,141 @@  static void virtio_skb_set_hash(const struct virtio_net_hdr_v1_hash *hdr_hash,
 	skb_set_hash(skb, __le32_to_cpu(hdr_hash->hash_value), rss_hash_type);
 }
 
+static int virtnet_flow_dissect_udp_tcp(struct virtnet_info *vi, struct sk_buff *skb)
+{
+	struct net_device *dev = vi->dev;
+	struct flow_keys_basic keys;
+	struct udphdr *uh;
+	struct tcphdr *th;
+	int len, offset;
+
+	/* The flow dissector needs this information. */
+	skb->dev = dev;
+	skb_reset_mac_header(skb);
+	skb->protocol = dev_parse_header_protocol(skb);
+	/* virtio-net does not need to resolve VLAN. */
+	skb_set_network_header(skb, ETH_HLEN);
+	if (!skb_flow_dissect_flow_keys_basic(NULL, skb, &keys,
+					      NULL, 0, 0, 0, 0))
+		return -EINVAL;
+
+	/* 1. Pseudo-header checksum calculation requires:
+	 *    (1) saddr/daddr (2) IP_PROTO (3) length of transport payload
+	 * 2. We don't parse SCTP because virtio-net currently doesn't
+	 *    support CRC checksum offloading for SCTP.
+	 */
+	if (keys.basic.n_proto == htons(ETH_P_IP)) {
+		struct iphdr *iph;
+
+		/* Flow dissector has verified that there is an IP header.
+		 * So we do not need a pskb_may_pull().
+		 */
+		iph = ip_hdr(skb);
+		if (iph->version != 4)
+			return -EINVAL;
+
+		skb->transport_header = skb->mac_header + keys.control.thoff;
+		offset = skb_transport_offset(skb);
+		len = skb->len - offset;
+		if (keys.basic.ip_proto == IPPROTO_UDP) {
+			if (!pskb_may_pull(skb, offset + sizeof(struct udphdr)))
+				return -EINVAL;
+
+			uh = udp_hdr(skb);
+			skb->csum_offset = offsetof(struct udphdr, check);
+			/* Although uh->len is already the 3rd parameter for the calculation
+			 * of the pseudo-header checksum, we have already calculated the
+			 * length of the transport layer, so use 'len' here directly.
+			 */
+			uh->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, len,
+					IPPROTO_UDP, 0);
+		} else if (keys.basic.ip_proto == IPPROTO_TCP) {
+			if (!pskb_may_pull(skb, offset + sizeof(struct tcphdr)))
+				return -EINVAL;
+
+			th = tcp_hdr(skb);
+			skb->csum_offset = offsetof(struct tcphdr, check);
+			th->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, len,
+					IPPROTO_TCP, 0);
+		} /* virtio-net doesn't support checksums for SCTP crc hw offloading.*/
+	} else if (keys.basic.n_proto == htons(ETH_P_IPV6)) {
+		struct ipv6hdr *ip6h;
+
+		ip6h = ipv6_hdr(skb);
+		if (ip6h->version != 6)
+			return -EINVAL;
+
+		/* We have skipped the possible extension headers for IPv6.
+		 * If there is a Routing Header, the tx's check value is calculated by
+		 * final_dst, and that value is the rx's daddr.
+		 */
+		skb->transport_header = skb->mac_header + keys.control.thoff;
+		offset = skb_transport_offset(skb);
+		len = skb->len - offset;
+		if (keys.basic.ip_proto == IPPROTO_UDP) {
+			if (!pskb_may_pull(skb, offset + sizeof(struct udphdr)))
+				return -EINVAL;
+
+			uh = udp_hdr(skb);
+			skb->csum_offset = offsetof(struct udphdr, check);
+			uh->check = ~csum_ipv6_magic((const struct in6_addr *)&ip6h->saddr,
+					(const struct in6_addr *)&ip6h->daddr,
+					len, IPPROTO_UDP, 0);
+		} else if (keys.basic.ip_proto == IPPROTO_TCP) {
+			if (!pskb_may_pull(skb, offset + sizeof(struct tcphdr)))
+				return -EINVAL;
+
+			th = tcp_hdr(skb);
+			skb->csum_offset = offsetof(struct tcphdr, check);
+			th->check = ~csum_ipv6_magic((const struct in6_addr *)&ip6h->saddr,
+					(const struct in6_addr *)&ip6h->daddr,
+					len, IPPROTO_TCP, 0);
+		}
+	}
+
+	skb->csum_start = skb->transport_header;
+
+	return 0;
+}
+
+static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
+				      struct sk_buff *skb,
+				      __u8 flags)
+{
+	int err;
+
+	/* When XDP program is loaded, for example, the vm-vm scenario
+	 * on the same host, packets marked as VIRTIO_NET_HDR_F_NEEDS_CSUM
+	 * will travel. Although these packets are safe from the point of
+	 * view of the vm, to avoid modification by XDP and successful
+	 * forwarding in the upper layer, we re-probe the necessary checksum
+	 * related information: skb->csum_{start, offset}, pseudo-header csum.
+	 *
+	 * This benefits us:
+	 * 1. XDP can be loaded when there's _F_GUEST_CSUM.
+	 * 2. The device verifies the checksum of packets , especially
+	 *    benefiting for large packets.
+	 * 3. In the same-host vm-vm scenario, packets marked as
+	 *    VIRTIO_NET_HDR_F_NEEDS_CSUM are no longer dropped after being
+	 *    processed by XDP.
+	 */
+	if (flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
+		err = virtnet_flow_dissect_udp_tcp(vi, skb);
+		if (err < 0)
+			return -EINVAL;
+
+		skb->ip_summed = CHECKSUM_PARTIAL;
+	} else if (flags & VIRTIO_NET_HDR_F_DATA_VALID) {
+		/* We want to benefit from this: XDP guarantees that packets marked
+		 * as VIRTIO_NET_HDR_F_DATA_VALID still have correct csum after they
+		 * are processed.
+		 */
+		skb->ip_summed = CHECKSUM_UNNECESSARY;
+	}
+
+	return 0;
+}
+
 static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
 			void *buf, unsigned int len, void **ctx,
 			unsigned int *xdp_xmit,