Message ID | 20240313124613.51399-1-mark@yotsuba.nl (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | netpoll: support sending over raw IP interfaces | expand |
On 2024-03-13 at 18:16:13, Mark Cilissen (mark@yotsuba.nl) wrote: > Currently, netpoll only supports interfaces with an ethernet-compatible > link layer. Certain interfaces like SLIP do not have a link layer > on the network interface level at all and expect raw IP packets, > and could benefit from being supported by netpoll. > > This commit adds support for such interfaces by using the network device's > `hard_header_len` field as an indication that no link layer is present. > If that is the case we simply skip adding the ethernet header, causing > a raw IP packet to be sent over the interface. This has been confirmed > to add netconsole support to at least SLIP and WireGuard interfaces. > > Signed-off-by: Mark Cilissen <mark@yotsuba.nl> > --- > Documentation/networking/netconsole.rst | 3 ++- > net/core/netpoll.c | 30 ++++++++++++++++--------- > 2 files changed, 22 insertions(+), 11 deletions(-) > > diff --git a/Documentation/networking/netconsole.rst b/Documentation/networking/netconsole.rst > index d55c2a22ec7a..434ce0366027 100644 > --- a/Documentation/networking/netconsole.rst > +++ b/Documentation/networking/netconsole.rst > @@ -327,4 +327,5 @@ enable the logging of even the most critical kernel bugs. It works > from IRQ contexts as well, and does not enable interrupts while > sending packets. Due to these unique needs, configuration cannot > be more automatic, and some fundamental limitations will remain: > -only IP networks, UDP packets and ethernet devices are supported. > +only UDP packets over IP networks, over ethernet links if a > +hardware layer is required, are supported. > diff --git a/net/core/netpoll.c b/net/core/netpoll.c > index 543007f159f9..0299fb71b456 100644 > --- a/net/core/netpoll.c > +++ b/net/core/netpoll.c > @@ -399,7 +399,7 @@ EXPORT_SYMBOL(netpoll_send_skb); > > void netpoll_send_udp(struct netpoll *np, const char *msg, int len) > { > - int total_len, ip_len, udp_len; > + int total_len, ip_len, udp_len, link_len; > struct sk_buff *skb; > struct udphdr *udph; > struct iphdr *iph; > @@ -416,7 +416,10 @@ void netpoll_send_udp(struct netpoll *np, const char *msg, int len) > else > ip_len = udp_len + sizeof(*iph); > > - total_len = ip_len + LL_RESERVED_SPACE(np->dev); > + /* if there's a hardware header assume ethernet, else raw IP */ Taking an assumption based on dev's lower layer does not look to be good. why not transmit packet from skb_network_header() in your driver (by making changes in your driver) > + eth = NULL; > + link_len = np->dev->hard_header_len ? LL_RESERVED_SPACE(np->dev) : 0; > + total_len = ip_len + link_len; > > skb = find_skb(np, total_len + np->dev->needed_tailroom, > total_len - len); > @@ -458,9 +461,11 @@ void netpoll_send_udp(struct netpoll *np, const char *msg, int len) > ip6h->saddr = np->local_ip.in6; > ip6h->daddr = np->remote_ip.in6; > > - eth = skb_push(skb, ETH_HLEN); > - skb_reset_mac_header(skb); > - skb->protocol = eth->h_proto = htons(ETH_P_IPV6); > + skb->protocol = htons(ETH_P_IPV6); > + if (link_len) { > + eth = skb_push(skb, ETH_HLEN); > + skb_reset_mac_header(skb); > + } > } else { > udph->check = 0; > udph->check = csum_tcpudp_magic(np->local_ip.ip, > @@ -487,13 +492,18 @@ void netpoll_send_udp(struct netpoll *np, const char *msg, int len) > put_unaligned(np->remote_ip.ip, &(iph->daddr)); > iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl); > > - eth = skb_push(skb, ETH_HLEN); > - skb_reset_mac_header(skb); > - skb->protocol = eth->h_proto = htons(ETH_P_IP); > + skb->protocol = htons(ETH_P_IP); > + if (link_len) { > + eth = skb_push(skb, ETH_HLEN); > + skb_reset_mac_header(skb); > + } > } > > - ether_addr_copy(eth->h_source, np->dev->dev_addr); > - ether_addr_copy(eth->h_dest, np->remote_mac); > + if (eth) { > + eth->h_proto = skb->protocol; > + ether_addr_copy(eth->h_source, np->dev->dev_addr); > + ether_addr_copy(eth->h_dest, np->remote_mac); > + } > > skb->dev = np->dev; > > -- > 2.43.0 >
Hi Ratheesh, > Op 13 mrt 6 Reiwa, om 14:36 heeft Ratheesh Kannoth <rkannoth@marvell.com> het volgende geschreven: > > On 2024-03-13 at 18:16:13, Mark Cilissen (mark@yotsuba.nl) wrote: >> […] > Taking an assumption based on dev’s lower layer does not look to be good. > why not transmit packet from skb_network_header() in your driver (by making > changes in your driver) There’s two assumptions at play here: - The lower layer is ethernet: this has always been present in netpoll, and is even documented in netconsole.rst. This comment just mentions it because we add a way to bypass the assumption; it is not an assumption this patch adds to the code. - hard_header_len==0 means that there is no exposed link layer: this is a rather conservative assumption in my opinion, and is also mentioned in the definition of LL_RESERVED_SPACE: > * Alternative is: > * dev->hard_header_len ? (dev->hard_header_len + > * (HH_DATA_MOD - 1)) & ~(HH_DATA_MOD - 1) : 0 The same assumption is also made in more places in the core network code, like af_packet: > - If the device has no dev->header_ops->create, there is no LL header > visible above the device. In this case, its hard_header_len should be 0. > The device may prepend its own header internally. In this case, its > needed_headroom should be set to the space needed for it to add its > internal header. I could change it to, like af_packet, check `dev->header_ops` instead if that is preferred, but I don’t think that patching every single raw IP driver to deal with skbs that managed to somehow have link layer data would be preferred here, especially since netpoll is kind of a special case to begin with. I am open to suggestions and ideas, though. > […] Thanks and regards, Mark
> From: Mark <mark@yotsuba.nl> > Sent: Wednesday, March 13, 2024 7:23 PM > To: Ratheesh Kannoth <rkannoth@marvell.com> > > On 2024-03-13 at 18:16:13, Mark Cilissen (mark@yotsuba.nl) wrote: > >> […] > > Taking an assumption based on dev’s lower layer does not look to be good. > > why not transmit packet from skb_network_header() in your driver (by > > making changes in your driver) > > There’s two assumptions at play here: > - The lower layer is ethernet: this has always been present in netpoll, and is > even > documented in netconsole.rst. This comment just mentions it because we > add a way > to bypass the assumption; it is not an assumption this patch adds to the > code. > - hard_header_len==0 means that there is no exposed link layer: this is a rather > conservative assumption in my opinion, and is also mentioned in the > definition Hmm. That is not my question. Let me explain it in detail. Netconsole is using netpoll_send_udp() to encapsulate the msg over UDP/IP/ MAC headers. Job well done. Now it calls netdev->ops->ndo_start_xmit(skb, dev). If your driver is well aware that you can Transmit only from network header, why don’t you dma map from network header ? > of LL_RESERVED_SPACE: > > > * Alternative is: > > * dev->hard_header_len ? (dev->hard_header_len + > > * (HH_DATA_MOD - 1)) & ~(HH_DATA_MOD - 1) : 0 > > The same assumption is also made in more places in the core network code, > like af_packet: > > > - If the device has no dev->header_ops->create, there is no LL header > > visible above the device. In this case, its hard_header_len should be 0. > > The device may prepend its own header internally. In this case, its > > needed_headroom should be set to the space needed for it to add its > > internal header. > ACK.
On Wed, 2024-03-13 at 13:46 +0100, Mark Cilissen wrote: > Currently, netpoll only supports interfaces with an ethernet-compatible > link layer. Certain interfaces like SLIP do not have a link layer > on the network interface level at all and expect raw IP packets, > and could benefit from being supported by netpoll. > > This commit adds support for such interfaces by using the network device's > `hard_header_len` field as an indication that no link layer is present. > If that is the case we simply skip adding the ethernet header, causing > a raw IP packet to be sent over the interface. This has been confirmed > to add netconsole support to at least SLIP and WireGuard interfaces. > > Signed-off-by: Mark Cilissen <mark@yotsuba.nl> ## Form letter - net-next-closed The merge window for v6.9 has begun and we have already posted our pull request. Therefore net-next is closed for new drivers, features, code refactoring and optimizations. We are currently accepting bug fixes only. Please repost when net-next reopens after March 25th. RFC patches sent for review only are obviously welcome at any time. See: https://www.kernel.org/doc/html/next/process/maintainer-netdev.html#development-cycle -- pw-bot: defer
On Wed, 13 Mar 2024 13:46:13 +0100 Mark Cilissen wrote: > Currently, netpoll only supports interfaces with an ethernet-compatible > link layer. Certain interfaces like SLIP do not have a link layer > on the network interface level at all and expect raw IP packets, > and could benefit from being supported by netpoll. > > This commit adds support for such interfaces by using the network device's > `hard_header_len` field as an indication that no link layer is present. > If that is the case we simply skip adding the ethernet header, causing > a raw IP packet to be sent over the interface. This has been confirmed > to add netconsole support to at least SLIP and WireGuard interfaces. Would be great if this could come with a simple selftest under tools/testing/selftests/net. Preferably using some simple tunnel device, rather than wg to limit tooling dependencies ;)
Hi Ratheesh, > Op 14 mrt 6 Reiwa, om 03:46 heeft Ratheesh Kannoth <rkannoth@marvell.com> het volgende geschreven: > >> From: Mark <mark@yotsuba.nl> >> […] > > Hmm. That is not my question. Let me explain it in detail. Netconsole is using netpoll_send_udp() to encapsulate the msg over > UDP/IP/ MAC headers. Job well done. Now it calls netdev->ops->ndo_start_xmit(skb, dev). If your driver is well aware that you can > Transmit only from network header, why don’t you dma map from network header ? The rest of the network subsystem seems to not add a header to skbs submitted to netdev->ops->ndo_start_xmit() at all, which makes sense considering netdev->header_ops is either NULL or no-op for these devices. Following this line of reasoning, from API perspective it made more sense to me for netpoll to not submit ‘bogus’ skbs that are out-of-line with what the rest of the network subsystem does to ndo_start_xmit() to begin with. It really depends on the API guarantees we want to have for netdev, but personally I'm wary of introducing an allowance for bogus headers. Additionally from a practical perspective, this would require changing almost every, if not every, IP interface driver. I took a look at the WireGuard driver to see what it would entail, and from my limited experience with the networking code it seems like there's some quite annoying interactions with e.g. GSO which would make driver-side handling of such packets quite a bit more complex. So from my perspective, fixing this in netpoll is both the more API-correct change and introduces the least amount of additional complexity. > […] Thanks and regards, Mark
Hi Jakub, > Op 14 mrt 6 Reiwa, om 19:34 heeft Jakub Kicinski <kuba@kernel.org> het volgende geschreven: > > On Wed, 13 Mar 2024 13:46:13 +0100 Mark Cilissen wrote: >> Currently, netpoll only supports interfaces with an ethernet-compatible >> link layer. Certain interfaces like SLIP do not have a link layer >> on the network interface level at all and expect raw IP packets, >> and could benefit from being supported by netpoll. >> >> This commit adds support for such interfaces by using the network device's >> `hard_header_len` field as an indication that no link layer is present. >> If that is the case we simply skip adding the ethernet header, causing >> a raw IP packet to be sent over the interface. This has been confirmed >> to add netconsole support to at least SLIP and WireGuard interfaces. > > Would be great if this could come with a simple selftest under > tools/testing/selftests/net. Preferably using some simple tunnel > device, rather than wg to limit tooling dependencies ;) Yes, that makes a lot of sense. I wrote a selftest that tests netconsole using IPIP and GRE tunnels, and they too seem to work correctly and pass (and more importantly, fail without the patch ;-). Would you prefer me to submit a v2 with that test now, or when net-next reopens in a week? Thanks and regards, Mark
> From: Mark <mark@yotsuba.nl> > Sent: Monday, March 18, 2024 5:13 PM > To: Ratheesh Kannoth <rkannoth@marvell.com> > Cc: netdev@vger.kernel.org; Hans de Goede <hdegoede@redhat.com>; Eric > Dumazet <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Breno > Leitao <leitao@debian.org>; Ingo Molnar <mingo@redhat.com>; David S. > Miller <davem@davemloft.net>; Paolo Abeni <pabeni@redhat.com>; linux- > kernel@vger.kernel.org > Subject: Re: [EXTERNAL] [PATCH] netpoll: support sending over raw IP > interfaces > > > Hi Ratheesh, > > > Op 14 mrt 6 Reiwa, om 03:46 heeft Ratheesh Kannoth > <rkannoth@marvell.com> het volgende geschreven: > > > >> From: Mark <mark@yotsuba.nl> > >> […] > > > > Hmm. That is not my question. Let me explain it in detail. Netconsole is > using netpoll_send_udp() to encapsulate the msg over > > UDP/IP/ MAC headers. Job well done. Now it calls > > netdev->ops->ndo_start_xmit(skb, dev). If your driver is well aware that > you can Transmit only from network header, why don’t you dma map from > network header ? > > The rest of the network subsystem seems to not add a header to skbs > submitted to netdev->ops->ndo_start_xmit() at all, which makes sense > considering > netdev->header_ops is either NULL or no-op for these devices. > > Following this line of reasoning, from API perspective it made more sense to > me for netpoll to not submit ‘bogus’ skbs that are out-of-line with what the > rest of the network subsystem does to ndo_start_xmit() to begin with. > It really depends on the API guarantees we want to have for netdev, but > personally I'm wary of introducing an allowance for bogus headers. > Is below network topology possible ? Netpoll()- ------> netdev A ----> raw interface Where netdev A's netdev->header_ops != NULL > Additionally from a practical perspective, this would require changing almost > every, if not every, IP interface driver. I took a look at the WireGuard driver to > see what it would entail, and from my limited experience with the networking > code it seems like there's some quite annoying interactions with e.g. GSO > which would make driver-side handling of such packets quite a bit more > complex. ACK. > > So from my perspective, fixing this in netpoll is both the more API-correct > change and introduces the least amount of additional complexity. ACK.
On Mon, 18 Mar 2024 12:47:46 +0100 Mark wrote: > > Would be great if this could come with a simple selftest under > > tools/testing/selftests/net. Preferably using some simple tunnel > > device, rather than wg to limit tooling dependencies ;) > > Yes, that makes a lot of sense. I wrote a selftest that tests netconsole > using IPIP and GRE tunnels, and they too seem to work correctly and pass (and > more importantly, fail without the patch ;-). Would you prefer me to submit > a v2 with that test now, or when net-next reopens in a week? Feel free to post as [RFC net-next v2] now. I can't guarantee people will find the time to review, but it shouldn't hurt to have it on the list :)
Hi Ratheesh, > Op 18 mrt 6 Reiwa, om 15:06 heeft Ratheesh Kannoth <rkannoth@marvell.com> het volgende geschreven: > >> […] > Is below network topology possible ? > Netpoll()- ------> netdev A ----> raw interface > Where netdev A's netdev->header_ops != NULL I believe so, this is not uncommon in tunnel devices like gretap. However, those fully encapsulate the link layer header in the packet to the lower interface. I am not aware of a interface driver that removes a header upon xmit, so to speak. However, I have just posted a v2 that instead uses the documented `dev_has_header()` API, which seems to fit the check exactly, here: https://lore.kernel.org/netdev/20240321122003.20089-1-mark@yotsuba.nl/T/ I hope this change manages to mitigate your concerns. :-) > […] Thanks and regards, Mark
> From: Mark <mark@yotsuba.nl> > Sent: Thursday, March 21, 2024 6:03 PM > To: Ratheesh Kannoth <rkannoth@marvell.com> > Cc: netdev@vger.kernel.org; Hans de Goede <hdegoede@redhat.com>; Eric > Dumazet <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Breno > Leitao <leitao@debian.org>; Ingo Molnar <mingo@redhat.com>; David S. > Miller <davem@davemloft.net>; Paolo Abeni <pabeni@redhat.com>; linux- > kernel@vger.kernel.org > Subject: Re: [EXTERNAL] [PATCH] netpoll: support sending over raw IP > interfaces > > Hi Ratheesh, > > > Op 18 mrt 6 Reiwa, om 15:06 heeft Ratheesh Kannoth > <rkannoth@marvell.com> het volgende geschreven: > > > >> […] > > Is below network topology possible ? > > Netpoll()- ------> netdev A ----> raw interface Where netdev A's > > netdev->header_ops != NULL > > I believe so, this is not uncommon in tunnel devices like gretap. > However, those fully encapsulate the link layer header in the packet to the > lower interface. I am not aware of a interface driver that removes a header > upon xmit, so to speak. However, I have just posted a v2 that instead uses the > documented `dev_has_header()` API, which seems to fit the check exactly, > here: ACK. I understand the complexity to implement it in driver than a easy fix in netpoll.
diff --git a/Documentation/networking/netconsole.rst b/Documentation/networking/netconsole.rst index d55c2a22ec7a..434ce0366027 100644 --- a/Documentation/networking/netconsole.rst +++ b/Documentation/networking/netconsole.rst @@ -327,4 +327,5 @@ enable the logging of even the most critical kernel bugs. It works from IRQ contexts as well, and does not enable interrupts while sending packets. Due to these unique needs, configuration cannot be more automatic, and some fundamental limitations will remain: -only IP networks, UDP packets and ethernet devices are supported. +only UDP packets over IP networks, over ethernet links if a +hardware layer is required, are supported. diff --git a/net/core/netpoll.c b/net/core/netpoll.c index 543007f159f9..0299fb71b456 100644 --- a/net/core/netpoll.c +++ b/net/core/netpoll.c @@ -399,7 +399,7 @@ EXPORT_SYMBOL(netpoll_send_skb); void netpoll_send_udp(struct netpoll *np, const char *msg, int len) { - int total_len, ip_len, udp_len; + int total_len, ip_len, udp_len, link_len; struct sk_buff *skb; struct udphdr *udph; struct iphdr *iph; @@ -416,7 +416,10 @@ void netpoll_send_udp(struct netpoll *np, const char *msg, int len) else ip_len = udp_len + sizeof(*iph); - total_len = ip_len + LL_RESERVED_SPACE(np->dev); + /* if there's a hardware header assume ethernet, else raw IP */ + eth = NULL; + link_len = np->dev->hard_header_len ? LL_RESERVED_SPACE(np->dev) : 0; + total_len = ip_len + link_len; skb = find_skb(np, total_len + np->dev->needed_tailroom, total_len - len); @@ -458,9 +461,11 @@ void netpoll_send_udp(struct netpoll *np, const char *msg, int len) ip6h->saddr = np->local_ip.in6; ip6h->daddr = np->remote_ip.in6; - eth = skb_push(skb, ETH_HLEN); - skb_reset_mac_header(skb); - skb->protocol = eth->h_proto = htons(ETH_P_IPV6); + skb->protocol = htons(ETH_P_IPV6); + if (link_len) { + eth = skb_push(skb, ETH_HLEN); + skb_reset_mac_header(skb); + } } else { udph->check = 0; udph->check = csum_tcpudp_magic(np->local_ip.ip, @@ -487,13 +492,18 @@ void netpoll_send_udp(struct netpoll *np, const char *msg, int len) put_unaligned(np->remote_ip.ip, &(iph->daddr)); iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl); - eth = skb_push(skb, ETH_HLEN); - skb_reset_mac_header(skb); - skb->protocol = eth->h_proto = htons(ETH_P_IP); + skb->protocol = htons(ETH_P_IP); + if (link_len) { + eth = skb_push(skb, ETH_HLEN); + skb_reset_mac_header(skb); + } } - ether_addr_copy(eth->h_source, np->dev->dev_addr); - ether_addr_copy(eth->h_dest, np->remote_mac); + if (eth) { + eth->h_proto = skb->protocol; + ether_addr_copy(eth->h_source, np->dev->dev_addr); + ether_addr_copy(eth->h_dest, np->remote_mac); + } skb->dev = np->dev;
Currently, netpoll only supports interfaces with an ethernet-compatible link layer. Certain interfaces like SLIP do not have a link layer on the network interface level at all and expect raw IP packets, and could benefit from being supported by netpoll. This commit adds support for such interfaces by using the network device's `hard_header_len` field as an indication that no link layer is present. If that is the case we simply skip adding the ethernet header, causing a raw IP packet to be sent over the interface. This has been confirmed to add netconsole support to at least SLIP and WireGuard interfaces. Signed-off-by: Mark Cilissen <mark@yotsuba.nl> --- Documentation/networking/netconsole.rst | 3 ++- net/core/netpoll.c | 30 ++++++++++++++++--------- 2 files changed, 22 insertions(+), 11 deletions(-)