Message ID | 20210403114848.30528-3-o.rempel@pengutronix.de (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | ar9331: mainline some parts of switch functionality | expand |
Hi Oleksij, On Sat, Apr 03, 2021 at 01:48:41PM +0200, Oleksij Rempel wrote: > The ar9331 switch is not forwarding IGMP and MLD packets if IGMP > snooping is enabled. This patch is trying to mimic the HW heuristic to take > same decisions as this switch would do to be able to tell the linux > bridge if some packet was prabably forwarded or not. > > Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> > --- I am not familiar with IGMP/MLD, therefore I don't really understand what problem you are trying to solve. Your switch has packet traps for IGMP and MLD, ok. So it doesn't forward them. Must the IGMP/MLD packets be forwarded by an IGMP/MLD snooping bridge? Which ones and under what circumstances?
On Sat, Apr 03, 2021 at 04:03:18PM +0300, Vladimir Oltean wrote: > Hi Oleksij, > > On Sat, Apr 03, 2021 at 01:48:41PM +0200, Oleksij Rempel wrote: > > The ar9331 switch is not forwarding IGMP and MLD packets if IGMP > > snooping is enabled. This patch is trying to mimic the HW heuristic to take > > same decisions as this switch would do to be able to tell the linux > > bridge if some packet was prabably forwarded or not. > > > > Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> > > --- > > I am not familiar with IGMP/MLD, therefore I don't really understand > what problem you are trying to solve. > > Your switch has packet traps for IGMP and MLD, ok. So it doesn't forward > them. Must the IGMP/MLD packets be forwarded by an IGMP/MLD snooping > bridge? Which ones and under what circumstances? I'll better refer to the rfc: https://tools.ietf.org/html/rfc4541 Regards, Oleksij
On Sat, Apr 03, 2021 at 03:26:36PM +0200, Oleksij Rempel wrote: > On Sat, Apr 03, 2021 at 04:03:18PM +0300, Vladimir Oltean wrote: > > Hi Oleksij, > > > > On Sat, Apr 03, 2021 at 01:48:41PM +0200, Oleksij Rempel wrote: > > > The ar9331 switch is not forwarding IGMP and MLD packets if IGMP > > > snooping is enabled. This patch is trying to mimic the HW heuristic to take > > > same decisions as this switch would do to be able to tell the linux > > > bridge if some packet was prabably forwarded or not. > > > > > > Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> > > > --- > > > > I am not familiar with IGMP/MLD, therefore I don't really understand > > what problem you are trying to solve. > > > > Your switch has packet traps for IGMP and MLD, ok. So it doesn't forward > > them. Must the IGMP/MLD packets be forwarded by an IGMP/MLD snooping > > bridge? Which ones and under what circumstances? > > I'll better refer to the rfc: > https://tools.ietf.org/html/rfc4541 Ok, the question might have been a little bit dumb. I found this PDF: https://www.alliedtelesis.com/sites/default/files/documents/how-alliedware/howto_config_igmp1.pdf and it explains that: - a snooper floods the Membership Query messages from the network's querier towards all ports that are not blocked by STP - a snooper forwards all Membership Report messages from a client towards the All Groups port (which is how it reaches the querier). I'm asking this because I just want to understand what the bridge code does. Does the code path for IGMP_HOST_MEMBERSHIP_REPORT (for example) for a snooper go through should_deliver -> nbp_switchdev_allowed_egress, which is what you are affecting here?
> @@ -31,6 +96,13 @@ static struct sk_buff *ar9331_tag_xmit(struct sk_buff *skb, > __le16 *phdr; > u16 hdr; > > + if (dp->stp_state == BR_STATE_BLOCKING) { > + /* TODO: should we reflect it in the stats? */ > + netdev_warn_once(dev, "%s:%i dropping blocking packet\n", > + __func__, __LINE__); > + return NULL; > + } > + > phdr = skb_push(skb, AR9331_HDR_LEN); > > hdr = FIELD_PREP(AR9331_HDR_VERSION_MASK, AR9331_HDR_VERSION); Hi Oleksij This change does not seem to fit with what this patch is doing. I also think it is wrong. You still need BPDU to pass through a blocked port, otherwise spanning tree protocol will be unstable. Andrew
On Sat, Apr 03, 2021 at 04:46:06PM +0300, Vladimir Oltean wrote: > On Sat, Apr 03, 2021 at 03:26:36PM +0200, Oleksij Rempel wrote: > > On Sat, Apr 03, 2021 at 04:03:18PM +0300, Vladimir Oltean wrote: > > > Hi Oleksij, > > > > > > On Sat, Apr 03, 2021 at 01:48:41PM +0200, Oleksij Rempel wrote: > > > > The ar9331 switch is not forwarding IGMP and MLD packets if IGMP > > > > snooping is enabled. This patch is trying to mimic the HW heuristic to take > > > > same decisions as this switch would do to be able to tell the linux > > > > bridge if some packet was prabably forwarded or not. > > > > > > > > Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> > > > > --- > > > > > > I am not familiar with IGMP/MLD, therefore I don't really understand > > > what problem you are trying to solve. > > > > > > Your switch has packet traps for IGMP and MLD, ok. So it doesn't forward > > > them. Must the IGMP/MLD packets be forwarded by an IGMP/MLD snooping > > > bridge? Which ones and under what circumstances? > > > > I'll better refer to the rfc: > > https://tools.ietf.org/html/rfc4541 > > Ok, the question might have been a little bit dumb. > I found this PDF: > https://www.alliedtelesis.com/sites/default/files/documents/how-alliedware/howto_config_igmp1.pdf > and it explains that: > - a snooper floods the Membership Query messages from the network's > querier towards all ports that are not blocked by STP > - a snooper forwards all Membership Report messages from a client > towards the All Groups port (which is how it reaches the querier). > > I'm asking this because I just want to understand what the bridge code > does. Does the code path for IGMP_HOST_MEMBERSHIP_REPORT (for example) > for a snooper go through should_deliver -> nbp_switchdev_allowed_egress, > which is what you are affecting here? yes. the exact path should depend on this configuration option: /sys/devices/virtual/net/test/bridge/multicast_snooping I assume, some optimization can be done by letting DSA know the state of multicast_snooping. Off-topic question, this patch set stops to work after rebasing against latest netdev. I get following warning: ip l s lan0 master test RTNETLINK answers: Invalid argumen Are there some API changes? Regards, Oleksij
On Sat, Apr 03, 2021 at 05:22:24PM +0200, Oleksij Rempel wrote: > Off-topic question, this patch set stops to work after rebasing against > latest netdev. I get following warning: > ip l s lan0 master test > RTNETLINK answers: Invalid argumen > > Are there some API changes? Yes, it's likely that you are returning -EINVAL to some of the functions with which DSA calls you at .port_bridge_join time, see dsa_port_switchdev_sync.
Am 03.04.21 um 16:49 schrieb Andrew Lunn: >> @@ -31,6 +96,13 @@ static struct sk_buff *ar9331_tag_xmit(struct sk_buff *skb, >> __le16 *phdr; >> u16 hdr; >> >> + if (dp->stp_state == BR_STATE_BLOCKING) { >> + /* TODO: should we reflect it in the stats? */ >> + netdev_warn_once(dev, "%s:%i dropping blocking packet\n", >> + __func__, __LINE__); >> + return NULL; >> + } >> + >> phdr = skb_push(skb, AR9331_HDR_LEN); >> >> hdr = FIELD_PREP(AR9331_HDR_VERSION_MASK, AR9331_HDR_VERSION); > > Hi Oleksij > > This change does not seem to fit with what this patch is doing. done > I also think it is wrong. You still need BPDU to pass through a > blocked port, otherwise spanning tree protocol will be unstable. We need a better filter, otherwise, in case of software based STP, we are leaking packages on blocked port. For example DHCP do trigger lots of spam in the kernel log. I'll drop STP patch for now, it will be better to make a generic soft STP for all switches without HW offloading. For example ksz9477 is doing SW based STP in similar way. -- Regards, Oleksij
On Sat, Apr 03, 2021 at 07:14:56PM +0200, Oleksij Rempel wrote: > Am 03.04.21 um 16:49 schrieb Andrew Lunn: > >> @@ -31,6 +96,13 @@ static struct sk_buff *ar9331_tag_xmit(struct sk_buff *skb, > >> __le16 *phdr; > >> u16 hdr; > >> > >> + if (dp->stp_state == BR_STATE_BLOCKING) { > >> + /* TODO: should we reflect it in the stats? */ > >> + netdev_warn_once(dev, "%s:%i dropping blocking packet\n", > >> + __func__, __LINE__); > >> + return NULL; > >> + } > >> + > >> phdr = skb_push(skb, AR9331_HDR_LEN); > >> > >> hdr = FIELD_PREP(AR9331_HDR_VERSION_MASK, AR9331_HDR_VERSION); > > > > Hi Oleksij > > > > This change does not seem to fit with what this patch is doing. > > done > > > I also think it is wrong. You still need BPDU to pass through a > > blocked port, otherwise spanning tree protocol will be unstable. > > We need a better filter, otherwise, in case of software based STP, we are leaking packages on > blocked port. For example DHCP do trigger lots of spam in the kernel log. I have no idea whatsoever what 'software based STP' is, if you have hardware-accelerated forwarding. > I'll drop STP patch for now, it will be better to make a generic soft STP for all switches without > HW offloading. For example ksz9477 is doing SW based STP in similar way. How about we discuss first about what your switch is not doing properly? Have you debugged more than just watching the bridge change port states? As Andrew said, a port needs to accept and send link-local frames regardless of the STP state. In the BLOCKING state it must send no other frames and have address learning disabled. Is this what's happening, is the switch forwarding frames towards a BLOCKING port?
Am 04.04.21 um 02:02 schrieb Vladimir Oltean: > On Sat, Apr 03, 2021 at 07:14:56PM +0200, Oleksij Rempel wrote: >> Am 03.04.21 um 16:49 schrieb Andrew Lunn: >>>> @@ -31,6 +96,13 @@ static struct sk_buff *ar9331_tag_xmit(struct sk_buff *skb, >>>> __le16 *phdr; >>>> u16 hdr; >>>> >>>> + if (dp->stp_state == BR_STATE_BLOCKING) { >>>> + /* TODO: should we reflect it in the stats? */ >>>> + netdev_warn_once(dev, "%s:%i dropping blocking packet\n", >>>> + __func__, __LINE__); >>>> + return NULL; >>>> + } >>>> + >>>> phdr = skb_push(skb, AR9331_HDR_LEN); >>>> >>>> hdr = FIELD_PREP(AR9331_HDR_VERSION_MASK, AR9331_HDR_VERSION); >>> >>> Hi Oleksij >>> >>> This change does not seem to fit with what this patch is doing. >> >> done >> >>> I also think it is wrong. You still need BPDU to pass through a >>> blocked port, otherwise spanning tree protocol will be unstable. >> >> We need a better filter, otherwise, in case of software based STP, we are leaking packages on >> blocked port. For example DHCP do trigger lots of spam in the kernel log. > > I have no idea whatsoever what 'software based STP' is, if you have > hardware-accelerated forwarding. I do not mean hardware-accelerated forwarding, i mean hardware-accelerated STP port state helpers. >> I'll drop STP patch for now, it will be better to make a generic soft STP for all switches without >> HW offloading. For example ksz9477 is doing SW based STP in similar way. > > How about we discuss first about what your switch is not doing properly? > Have you debugged more than just watching the bridge change port states? > As Andrew said, a port needs to accept and send link-local frames > regardless of the STP state. In the BLOCKING state it must send no other > frames and have address learning disabled. Is this what's happening, is > the switch forwarding frames towards a BLOCKING port? The switch is not forwarding BPDU frame to the CPU port. So, the linux bridge will stack by cycling different state of the port where loop was detected. -- Regards, Oleksij
On Sun, Apr 04, 2021 at 07:35:26AM +0200, Oleksij Rempel wrote: > Am 04.04.21 um 02:02 schrieb Vladimir Oltean: > > On Sat, Apr 03, 2021 at 07:14:56PM +0200, Oleksij Rempel wrote: > >> Am 03.04.21 um 16:49 schrieb Andrew Lunn: > >>>> @@ -31,6 +96,13 @@ static struct sk_buff *ar9331_tag_xmit(struct sk_buff *skb, > >>>> __le16 *phdr; > >>>> u16 hdr; > >>>> > >>>> + if (dp->stp_state == BR_STATE_BLOCKING) { > >>>> + /* TODO: should we reflect it in the stats? */ > >>>> + netdev_warn_once(dev, "%s:%i dropping blocking packet\n", > >>>> + __func__, __LINE__); > >>>> + return NULL; > >>>> + } > >>>> + > >>>> phdr = skb_push(skb, AR9331_HDR_LEN); > >>>> > >>>> hdr = FIELD_PREP(AR9331_HDR_VERSION_MASK, AR9331_HDR_VERSION); > >>> > >>> Hi Oleksij > >>> > >>> This change does not seem to fit with what this patch is doing. > >> > >> done > >> > >>> I also think it is wrong. You still need BPDU to pass through a > >>> blocked port, otherwise spanning tree protocol will be unstable. > >> > >> We need a better filter, otherwise, in case of software based STP, we are leaking packages on > >> blocked port. For example DHCP do trigger lots of spam in the kernel log. > > > > I have no idea whatsoever what 'software based STP' is, if you have > > hardware-accelerated forwarding. > > I do not mean hardware-accelerated forwarding, i mean > hardware-accelerated STP port state helpers. Still no clue what you mean, sorry. > >> I'll drop STP patch for now, it will be better to make a generic soft STP for all switches without > >> HW offloading. For example ksz9477 is doing SW based STP in similar way. > > > > How about we discuss first about what your switch is not doing properly? > > Have you debugged more than just watching the bridge change port states? > > As Andrew said, a port needs to accept and send link-local frames > > regardless of the STP state. In the BLOCKING state it must send no other > > frames and have address learning disabled. Is this what's happening, is > > the switch forwarding frames towards a BLOCKING port? > > The switch is not forwarding BPDU frame to the CPU port. So, the linux > bridge will stack by cycling different state of the port where loop was > detected. The switch should not be 'forwarding' BPDU frames to the CPU port, it should be 'trapping' them. The difference is subtle but important. Often times switches have an Access Control List which allows them to steal packets from the normal FDB-based forwarding path. It is probably the case that your switch needs to be told to treat STP BPDUs specially and not just 'forward' them. To confirm whether I'm right or wrong, if you disable STP and send a packet with MAC DA 01:80:c2:00:00:00 to the switch, will it flood it towards all ports or will it only send them to the CPU?
diff --git a/net/dsa/tag_ar9331.c b/net/dsa/tag_ar9331.c index 002cf7f952e2..0ba90e0f91bb 100644 --- a/net/dsa/tag_ar9331.c +++ b/net/dsa/tag_ar9331.c @@ -6,6 +6,8 @@ #include <linux/bitfield.h> #include <linux/etherdevice.h> +#include <linux/igmp.h> +#include <net/addrconf.h> #include "dsa_priv.h" @@ -24,6 +26,69 @@ #define AR9331_HDR_RESERVED_MASK GENMASK(5, 4) #define AR9331_HDR_PORT_NUM_MASK GENMASK(3, 0) +/* + * This code replicated MLD detection more or less in the same way as the + * switch is doing it + */ +static int ipv6_mc_check_ip6hdr(struct sk_buff *skb) +{ + const struct ipv6hdr *ip6h; + unsigned int offset; + + offset = skb_network_offset(skb) + sizeof(*ip6h); + + if (!pskb_may_pull(skb, offset)) + return -EINVAL; + + ip6h = ipv6_hdr(skb); + + if (ip6h->version != 6) + return -EINVAL; + + skb_set_transport_header(skb, offset); + + return 0; +} + +static int ipv6_mc_check_exthdrs(struct sk_buff *skb) +{ + const struct ipv6hdr *ip6h; + int offset; + u8 nexthdr; + __be16 frag_off; + + ip6h = ipv6_hdr(skb); + + if (ip6h->nexthdr != IPPROTO_HOPOPTS) + return -ENOMSG; + + nexthdr = ip6h->nexthdr; + offset = skb_network_offset(skb) + sizeof(*ip6h); + offset = ipv6_skip_exthdr(skb, offset, &nexthdr, &frag_off); + + if (offset < 0) + return -EINVAL; + + if (nexthdr != IPPROTO_ICMPV6) + return -ENOMSG; + + skb_set_transport_header(skb, offset); + + return 0; +} + +static int my_ipv6_mc_check_mld(struct sk_buff *skb) +{ + int ret; + + ret = ipv6_mc_check_ip6hdr(skb); + if (ret < 0) + return ret; + + return ipv6_mc_check_exthdrs(skb); +} + + static struct sk_buff *ar9331_tag_xmit(struct sk_buff *skb, struct net_device *dev) { @@ -31,6 +96,13 @@ static struct sk_buff *ar9331_tag_xmit(struct sk_buff *skb, __le16 *phdr; u16 hdr; + if (dp->stp_state == BR_STATE_BLOCKING) { + /* TODO: should we reflect it in the stats? */ + netdev_warn_once(dev, "%s:%i dropping blocking packet\n", + __func__, __LINE__); + return NULL; + } + phdr = skb_push(skb, AR9331_HDR_LEN); hdr = FIELD_PREP(AR9331_HDR_VERSION_MASK, AR9331_HDR_VERSION); @@ -80,11 +152,69 @@ static struct sk_buff *ar9331_tag_rcv(struct sk_buff *skb, return skb; } +static void ar9331_tag_rcv_post(struct sk_buff *skb) +{ + const struct iphdr *iph; + unsigned char *dest; + int ret; + + /* + * Since the switch do not tell us which packets was offloaded we assume + * that all of them did. Except: + * - port is not configured for forwarding to any other ports + * - igmp/mld snooping is enabled + * - unicast or multicast flood is disabled on some of bridged ports + * - if we have two port bridge and one is not in forwarding state. + * - packet was dropped on the output port.. + * - any other reasons? + */ + skb->offload_fwd_mark = true; + + dest = eth_hdr(skb)->h_dest; + /* + * Complete not multicast traffic seems to be forwarded automatically, + * as long as multicast and unicast flood are enabled + */ + if (!(is_multicast_ether_addr(dest) && !is_broadcast_ether_addr(dest))) + return; + + + /* + * Documentation do not providing any usable information on how the + * igmp/mld snooping is implemented on this switch. Following + * implementation is based on testing, by sending correct and malformed + * packets to the switch. + * It is not trying to find sane and properly formated packets. Instead + * it is trying to be as close to the switch behavior as possible. + */ + skb_reset_network_header(skb); + switch (ntohs(skb->protocol)) { + case ETH_P_IP: + + if (!pskb_network_may_pull(skb, sizeof(*iph))) + break; + + iph = ip_hdr(skb); + if (iph->protocol == IPPROTO_IGMP) + skb->offload_fwd_mark = false; + + break; + case ETH_P_IPV6: + ret = my_ipv6_mc_check_mld(skb); + if (!ret) + skb->offload_fwd_mark = false; + + break; + } +} + + static const struct dsa_device_ops ar9331_netdev_ops = { .name = "ar9331", .proto = DSA_TAG_PROTO_AR9331, .xmit = ar9331_tag_xmit, .rcv = ar9331_tag_rcv, + .rcv_post = ar9331_tag_rcv_post, .overhead = AR9331_HDR_LEN, };
The ar9331 switch is not forwarding IGMP and MLD packets if IGMP snooping is enabled. This patch is trying to mimic the HW heuristic to take same decisions as this switch would do to be able to tell the linux bridge if some packet was prabably forwarded or not. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> --- net/dsa/tag_ar9331.c | 130 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 130 insertions(+)