Message ID | 20201110153958.ci5ekor3o2ekg3ky@ipetronik.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] net: lan78xx: Disable hardware vlan filtering in promiscuous mode | expand |
Context | Check | Description |
---|---|---|
netdev/apply | fail | Patch does not apply to net |
netdev/tree_selection | success | Clearly marked for net |
On Tue, 10 Nov 2020 16:39:58 +0100 Markus Blöchl wrote: > The rx-vlan-filter feature flag prevents unexpected tagged frames on > the wire from reaching the kernel in promiscuous mode. > Disable this offloading feature in the lan7800 controller whenever > IFF_PROMISC is set and make sure that the hardware features > are updated when IFF_PROMISC changes. > > Signed-off-by: Markus Blöchl <markus.bloechl@ipetronik.com> Doesn't apply to neither net or net-next, please respin. Could you also add a Fixes tag since this is a fix? Thanks!
On Tue, 10 Nov 2020 16:39:58 +0100 Markus Blöchl wrote: > The rx-vlan-filter feature flag prevents unexpected tagged frames on > the wire from reaching the kernel in promiscuous mode. > Disable this offloading feature in the lan7800 controller whenever > IFF_PROMISC is set and make sure that the hardware features > are updated when IFF_PROMISC changes. > > Signed-off-by: Markus Blöchl <markus.bloechl@ipetronik.com> > --- > > Notes: > When sniffing ethernet using a LAN7800 ethernet controller, vlan-tagged > frames are silently dropped by the controller due to the > RFE_CTL_VLAN_FILTER flag being set by default since commit > 4a27327b156e("net: lan78xx: Add support for VLAN filtering."). > > In order to receive those tagged frames it is necessary to manually disable > rx vlan filtering using ethtool ( `ethtool -K ethX rx-vlan-filter off` or > corresponding ioctls ). Setting all bits in the vlan filter table to 1 is > an even worse approach, imho. > > As a user I would probably expect that setting IFF_PROMISC should be enough > in all cases to receive all valid traffic. > Therefore I think this behaviour is a bug in the driver, since other > drivers (notably ixgbe) automatically disable rx-vlan-filter when > IFF_PROMISC is set. Please correct me if I am wrong here. I've been mulling over this, I'm not 100% sure that disabling VLAN filters on promisc is indeed the right thing to do. The ixgbe doing this is somewhat convincing. OTOH users would not expect flow filters to get disabled when promisc is on, so why disable vlan filters? Either way we should either document this as an expected behavior or make the core clear the features automatically rather than force drivers to worry about it. Does anyone else have an opinion, please? > diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c > index 65b315bc60ab..ac6c0beeac21 100644 > --- a/drivers/net/usb/lan78xx.c > +++ b/drivers/net/usb/lan78xx.c > @@ -2324,13 +2324,15 @@ static int lan78xx_set_mac_addr(struct net_device *netdev, void *p) > } > > /* Enable or disable Rx checksum offload engine */ > -static int lan78xx_set_features(struct net_device *netdev, > - netdev_features_t features) > +static void lan78xx_update_features(struct net_device *netdev, > + netdev_features_t features) > { > struct lan78xx_net *dev = netdev_priv(netdev); > struct lan78xx_priv *pdata = (struct lan78xx_priv *)(dev->data[0]); > unsigned long flags; > - int ret; > + > + if (netdev->flags & IFF_PROMISC) > + features &= ~NETIF_F_HW_VLAN_CTAG_FILTER; > > spin_lock_irqsave(&pdata->rfe_ctl_lock, flags); > > @@ -2353,12 +2355,30 @@ static int lan78xx_set_features(struct net_device *netdev, > pdata->rfe_ctl &= ~RFE_CTL_VLAN_FILTER_; > > spin_unlock_irqrestore(&pdata->rfe_ctl_lock, flags); > +} > + > +static int lan78xx_set_features(struct net_device *netdev, > + netdev_features_t features) > +{ > + struct lan78xx_net *dev = netdev_priv(netdev); > + struct lan78xx_priv *pdata = (struct lan78xx_priv *)(dev->data[0]); > + int ret; > + > + lan78xx_update_features(netdev, features); > > ret = lan78xx_write_reg(dev, RFE_CTL, pdata->rfe_ctl); > > return 0; > } > > +static void lan78xx_set_rx_mode(struct net_device *netdev) > +{ > + /* Enable or disable checksum offload engines */ > + lan78xx_update_features(netdev, netdev->features); > + > + lan78xx_set_multicast(netdev); > +} > + > static void lan78xx_deferred_vlan_write(struct work_struct *param) > { > struct lan78xx_priv *pdata = > @@ -2528,10 +2548,7 @@ static int lan78xx_reset(struct lan78xx_net *dev) > pdata->rfe_ctl |= RFE_CTL_BCAST_EN_ | RFE_CTL_DA_PERFECT_; > ret = lan78xx_write_reg(dev, RFE_CTL, pdata->rfe_ctl); > > - /* Enable or disable checksum offload engines */ > - lan78xx_set_features(dev->net, dev->net->features); > - > - lan78xx_set_multicast(dev->net); > + lan78xx_set_rx_mode(dev->net); > > /* reset PHY */ > ret = lan78xx_read_reg(dev, PMT_CTL, &buf); > @@ -3613,7 +3630,7 @@ static const struct net_device_ops lan78xx_netdev_ops = { > .ndo_set_mac_address = lan78xx_set_mac_addr, > .ndo_validate_addr = eth_validate_addr, > .ndo_do_ioctl = phy_do_ioctl_running, > - .ndo_set_rx_mode = lan78xx_set_multicast, > + .ndo_set_rx_mode = lan78xx_set_rx_mode, > .ndo_set_features = lan78xx_set_features, > .ndo_vlan_rx_add_vid = lan78xx_vlan_rx_add_vid, > .ndo_vlan_rx_kill_vid = lan78xx_vlan_rx_kill_vid, > > base-commit: 4e0396c59559264442963b349ab71f66e471f84d > -- > 2.29.2 > > > Impressum/Imprint: https://www.ipetronik.com/impressum
On 11/11/2020 7:43 AM, Jakub Kicinski wrote: > On Tue, 10 Nov 2020 16:39:58 +0100 Markus Blöchl wrote: >> The rx-vlan-filter feature flag prevents unexpected tagged frames on >> the wire from reaching the kernel in promiscuous mode. >> Disable this offloading feature in the lan7800 controller whenever >> IFF_PROMISC is set and make sure that the hardware features >> are updated when IFF_PROMISC changes. >> >> Signed-off-by: Markus Blöchl <markus.bloechl@ipetronik.com> >> --- >> >> Notes: >> When sniffing ethernet using a LAN7800 ethernet controller, vlan-tagged >> frames are silently dropped by the controller due to the >> RFE_CTL_VLAN_FILTER flag being set by default since commit >> 4a27327b156e("net: lan78xx: Add support for VLAN filtering."). >> >> In order to receive those tagged frames it is necessary to manually disable >> rx vlan filtering using ethtool ( `ethtool -K ethX rx-vlan-filter off` or >> corresponding ioctls ). Setting all bits in the vlan filter table to 1 is >> an even worse approach, imho. >> >> As a user I would probably expect that setting IFF_PROMISC should be enough >> in all cases to receive all valid traffic. >> Therefore I think this behaviour is a bug in the driver, since other >> drivers (notably ixgbe) automatically disable rx-vlan-filter when >> IFF_PROMISC is set. Please correct me if I am wrong here. > > I've been mulling over this, I'm not 100% sure that disabling VLAN > filters on promisc is indeed the right thing to do. The ixgbe doing > this is somewhat convincing. OTOH users would not expect flow filters > to get disabled when promisc is on, so why disable vlan filters? > > Either way we should either document this as an expected behavior or > make the core clear the features automatically rather than force > drivers to worry about it. > > Does anyone else have an opinion, please? The semantics of promiscuous are pretty clear though, and if you have a NIC with VLAN filtering capability which could prevent the stack from seeing *all* packets, that would be considered a bug. I suppose that you could not disable VLAN filtering but instead install all 4096 - N VLANs (N being currently used) into the filter to guarantee receiving those VLAN tagged frames? As far as flow filters, this is actually a good question, it sounds like there are some possibly interesting problems to solve there. For instance with an Ethernet switch, if you had a rule that diverted packets to be switched directly to a particular port, what should happen when either of these ports is in promiscuous mode? Should the switch be instructed to replace all of the rules to forward + copy to the CPU?
On Wed, Nov 11, 2020 at 07:56:58AM -0800, Florian Fainelli wrote: > The semantics of promiscuous are pretty clear though, and if you have a > NIC with VLAN filtering capability which could prevent the stack from > seeing *all* packets, that would be considered a bug. I suppose that you > could not disable VLAN filtering but instead install all 4096 - N VLANs > (N being currently used) into the filter to guarantee receiving those > VLAN tagged frames? Are they? IEEE 802.3 clause 30.3.1.1.16 aPromiscuousStatus says: APPROPRIATE SYNTAX: BOOLEAN BEHAVIOUR DEFINED AS: A GET operation returns the value “true” for promiscuous mode enabled, and “false” otherwise. Frames without errors received solely because this attribute has the value “true” are counted as frames received correctly; frames received in this mode that do contain errors update the appropriate error counters. A SET operation to the value “true” provides a means to cause the LayerMgmtRecognizeAddress function to accept frames regardless of their destination address. A SET operation to the value “false” causes the MAC sublayer to return to the normal operation of carrying out address recognition procedures for station, broadcast, and multicast group addresses (LayerMgmtRecognizeAddress function).; As for IEEE 802.1Q, there's nothing about promiscuity in the context of VLAN there. Sadly, I think promiscuity refers only to address recognition for the purpose of packet termination. I cannot find any reference to VLAN in the context of promiscuity, or, for that matter, I cannot find any reference coming from a standards body that promiscuity would mean "accept all packets".
On Wed, Nov 11, 2020 at 06:47:27PM +0200, Vladimir Oltean wrote: > On Wed, Nov 11, 2020 at 07:56:58AM -0800, Florian Fainelli wrote: > > The semantics of promiscuous are pretty clear though, and if you have a > > NIC with VLAN filtering capability which could prevent the stack from > > seeing *all* packets, that would be considered a bug. I suppose that you > > could not disable VLAN filtering but instead install all 4096 - N VLANs > > (N being currently used) into the filter to guarantee receiving those > > VLAN tagged frames? > > Are they? > > IEEE 802.3 clause 30.3.1.1.16 aPromiscuousStatus says: > > APPROPRIATE SYNTAX: > BOOLEAN > > BEHAVIOUR DEFINED AS: > A GET operation returns the value “true” for promiscuous mode enabled, and “false” otherwise. > > Frames without errors received solely because this attribute has the value “true” are counted as > frames received correctly; frames received in this mode that do contain errors update the > appropriate error counters. > > A SET operation to the value “true” provides a means to cause the LayerMgmtRecognizeAddress > function to accept frames regardless of their destination address. > > A SET operation to the value “false” causes the MAC sublayer to return to the normal operation > of carrying out address recognition procedures for station, broadcast, and multicast group > addresses (LayerMgmtRecognizeAddress function).; > > > As for IEEE 802.1Q, there's nothing about promiscuity in the context of > VLAN there. > > Sadly, I think promiscuity refers only to address recognition for the > purpose of packet termination. I cannot find any reference to VLAN in > the context of promiscuity, or, for that matter, I cannot find any > reference coming from a standards body that promiscuity would mean > "accept all packets". I realize I did not tell you what the LayerMgmtRecognizeAddress function does. function LayerMgmtRecognizeAddress(address: AddressValue): Boolean; begin if {promiscuous receive enabled} then LayerMgmtRecognizeAddress := true; if address = ... {MAC station address} then LayerMgmtRecognizeAddress := true; if address = ... {Broadcast address} then LayerMgmtRecognizeAddress := true; if address = ... {One of the addresses on the multicast list and multicast reception is enabled} then LayerMgmtRecognizeAddress := true; LayerMgmtRecognizeAddress := false end; {LayerMgmtRecognizeAddress} Markus complained about the tcpdump program in particular. Well, tcpdump is a complex beast, and far too often, people seem to conflate tcpdump with promiscuity, even though: - promiscuity is not what enables tcpdump to see "all packets" being sent/received by the network stack on that interface, but ETH_P_ALL sockets are what do the magic there - tcpdump also has a --no-promiscuous-mode option. I would expect that tcpdump could gain a feature to disable (even if temporarily) the rx-vlan-filter offload, through an ethtool netlink message. Then users could get what they expect.
On Wed, Nov 11, 2020 at 07:43:41AM -0800, Jakub Kicinski wrote: > On Tue, 10 Nov 2020 16:39:58 +0100 Markus Blöchl wrote: > > The rx-vlan-filter feature flag prevents unexpected tagged frames on > > the wire from reaching the kernel in promiscuous mode. > > Disable this offloading feature in the lan7800 controller whenever > > IFF_PROMISC is set and make sure that the hardware features > > are updated when IFF_PROMISC changes. > > > > Signed-off-by: Markus Blöchl <markus.bloechl@ipetronik.com> > > --- > > > > Notes: > > When sniffing ethernet using a LAN7800 ethernet controller, vlan-tagged > > frames are silently dropped by the controller due to the > > RFE_CTL_VLAN_FILTER flag being set by default since commit > > 4a27327b156e("net: lan78xx: Add support for VLAN filtering."). > > > > In order to receive those tagged frames it is necessary to manually disable > > rx vlan filtering using ethtool ( `ethtool -K ethX rx-vlan-filter off` or > > corresponding ioctls ). Setting all bits in the vlan filter table to 1 is > > an even worse approach, imho. > > > > As a user I would probably expect that setting IFF_PROMISC should be enough > > in all cases to receive all valid traffic. > > Therefore I think this behaviour is a bug in the driver, since other > > drivers (notably ixgbe) automatically disable rx-vlan-filter when > > IFF_PROMISC is set. Please correct me if I am wrong here. > > I've been mulling over this, I'm not 100% sure that disabling VLAN > filters on promisc is indeed the right thing to do. The ixgbe doing > this is somewhat convincing. OTOH users would not expect flow filters > to get disabled when promisc is on, so why disable vlan filters? > > Either way we should either document this as an expected behavior or > make the core clear the features automatically rather than force > drivers to worry about it. > > Does anyone else have an opinion, please? I agree it is weird, but it seems to be justified by the bridge code. Note that the code was added around 2013, so it is possible it was influenced by the behavior of existing drivers. 1. vi net/bridge/br_vlan.c +245 /* Add VLAN to the device filter if it is supported. * This ensures tagged traffic enters the bridge when * promiscuous mode is disabled by br_manage_promisc(). */ 2. The bridge device itself will not filter packets based on their VID when it is in promiscuous mode: vi net/bridge/br_input.c +45 vg = br_vlan_group_rcu(br); /* Bridge is just like any other port. Make sure the * packet is allowed except in promisc modue when someone * may be running packet capture. */ if (!(brdev->flags & IFF_PROMISC) && !br_allowed_egress(vg, skb)) { kfree_skb(skb); return NET_RX_DROP; }
On Wed, Nov 11, 2020 at 06:47:27PM +0200, Vladimir Oltean wrote: > On Wed, Nov 11, 2020 at 07:56:58AM -0800, Florian Fainelli wrote: > > The semantics of promiscuous are pretty clear though, and if you have a > > NIC with VLAN filtering capability which could prevent the stack from > > seeing *all* packets, that would be considered a bug. I suppose that you > > could not disable VLAN filtering but instead install all 4096 - N VLANs > > (N being currently used) into the filter to guarantee receiving those > > VLAN tagged frames? Adding all VLAN ids to the filter is certainly a possibility, I just don't see why that redundant extra lookup per frame should be done in the NIC. I guess, we could also put that feature on WISH while promisc ist active. That, at least, makes it clear what happened. > > Are they? > > IEEE 802.3 clause 30.3.1.1.16 aPromiscuousStatus says: > > APPROPRIATE SYNTAX: > BOOLEAN > > BEHAVIOUR DEFINED AS: > A GET operation returns the value “true” for promiscuous mode enabled, and “false” otherwise. > > Frames without errors received solely because this attribute has the value “true” are counted as > frames received correctly; frames received in this mode that do contain errors update the > appropriate error counters. > > A SET operation to the value “true” provides a means to cause the LayerMgmtRecognizeAddress > function to accept frames regardless of their destination address. > > A SET operation to the value “false” causes the MAC sublayer to return to the normal operation > of carrying out address recognition procedures for station, broadcast, and multicast group > addresses (LayerMgmtRecognizeAddress function).; > > > As for IEEE 802.1Q, there's nothing about promiscuity in the context of > VLAN there. > > Sadly, I think promiscuity refers only to address recognition for the > purpose of packet termination. I cannot find any reference to VLAN in > the context of promiscuity, or, for that matter, I cannot find any > reference coming from a standards body that promiscuity would mean > "accept all packets". From what I can see, most other drivers use a special hardware register flag to enable promiscuous mode, which overrules all other filters. e.g. from the ASIX AX88178 datasheet: PRO: PACKET_TYPE_PROMISCUOUS. 1: All frames received by the ASIC are forwarded up toward the host. 0: Disabled (default). It is just so that the lan78xx controllers don't have this explicit flag. But since my change is more controversial than I anticipated, I would like to take a step back and ask some general questions first: We used to connect something akin to a port mirror to a lan7800 NIC (and a few others) in order to record and debug some VLAN heavy traffic. On older kernel versions putting the interface into promiscuous mode and opening and binding an AF_PACKET socket (or just throwing tcpdump or libpcap at it) was sufficient. After a kernel upgrade the same setup missed most of the traffic. Does this qualify as a regression? Why not? Should there be a documented and future proof way to receive *all* valid ethernet frames from an interface? This could be something like: a) - Bring up the interface - Put the interface into promiscuous mode - Open, bind and read a raw AF_PACKET socket with ETH_P_ALL - Patch up the 801.1Q headers if required. b) - The same as a) - Additionally enumerate and disable all available offloading features c) - Use libpcap / Do whatever libpcap does (like with TPACKET) In this case you need to help me convince the tcpdump folks that this is a bug on their side... ;-) d) - Read the controller datasheet - Read the kernel documentation - Read your kernels and drivers sources - Do whatever might be necessary e) - No, there is no guaranteed way to to this Any opinions on these questions? After those are answered, I am open to suggestions on how to fix this differently (if still needed). I'd rather not get involved into discussions on flow filters as I am absolutely clueless in this regard. PS: Sorry for sending from my companies mail server. It does some nasty transformations on the body to outgoing mails I just complained about. If this isn't resolved soon, I might follow up using another address which should preserve patches.
On Wed, Nov 11, 2020 at 7:43 AM Jakub Kicinski <kuba@kernel.org> wrote: > > On Tue, 10 Nov 2020 16:39:58 +0100 Markus Blöchl wrote: > > The rx-vlan-filter feature flag prevents unexpected tagged frames on > > the wire from reaching the kernel in promiscuous mode. > > Disable this offloading feature in the lan7800 controller whenever > > IFF_PROMISC is set and make sure that the hardware features > > are updated when IFF_PROMISC changes. > > > > Signed-off-by: Markus Blöchl <markus.bloechl@ipetronik.com> > > --- > > > > Notes: > > When sniffing ethernet using a LAN7800 ethernet controller, vlan-tagged > > frames are silently dropped by the controller due to the > > RFE_CTL_VLAN_FILTER flag being set by default since commit > > 4a27327b156e("net: lan78xx: Add support for VLAN filtering."). > > > > In order to receive those tagged frames it is necessary to manually disable > > rx vlan filtering using ethtool ( `ethtool -K ethX rx-vlan-filter off` or > > corresponding ioctls ). Setting all bits in the vlan filter table to 1 is > > an even worse approach, imho. > > > > As a user I would probably expect that setting IFF_PROMISC should be enough > > in all cases to receive all valid traffic. > > Therefore I think this behaviour is a bug in the driver, since other > > drivers (notably ixgbe) automatically disable rx-vlan-filter when > > IFF_PROMISC is set. Please correct me if I am wrong here. > > I've been mulling over this, I'm not 100% sure that disabling VLAN > filters on promisc is indeed the right thing to do. The ixgbe doing > this is somewhat convincing. OTOH users would not expect flow filters > to get disabled when promisc is on, so why disable vlan filters? > > Either way we should either document this as an expected behavior or > make the core clear the features automatically rather than force > drivers to worry about it. > > Does anyone else have an opinion, please? My personal preference is for the setting of all 4096 VLANs when promiscuous mode is enabled which is what we do with ixgbe if virtualization is enabled. If we want to provide an option to disable Rx VLAN filtering then just do it via the ethtool feature bit. With that things would be at least consistent as I suspect disabling Rx VLAN filtering may also impact Rx VLAN tag stripping so both might need to be updated at the same time. I know in the case of virtualization most NICs have to leave the VLAN filtering enabled for SR-IOV to be able to handle per VF VLAN filters regardless of the PF setting. Anyway that is my $.02. - Alex
On Wed, 11 Nov 2020 07:43:41 -0800 Jakub Kicinski wrote: > > In order to receive those tagged frames it is necessary to manually disable > > rx vlan filtering using ethtool ( `ethtool -K ethX rx-vlan-filter off` or > > corresponding ioctls ). Setting all bits in the vlan filter table to 1 is > > an even worse approach, imho. > > > > As a user I would probably expect that setting IFF_PROMISC should be enough > > in all cases to receive all valid traffic. > > Therefore I think this behaviour is a bug in the driver, since other > > drivers (notably ixgbe) automatically disable rx-vlan-filter when > > IFF_PROMISC is set. Please correct me if I am wrong here. > > I've been mulling over this, I'm not 100% sure that disabling VLAN > filters on promisc is indeed the right thing to do. The ixgbe doing > this is somewhat convincing. OTOH users would not expect flow filters > to get disabled when promisc is on, so why disable vlan filters? > > Either way we should either document this as an expected behavior or > make the core clear the features automatically rather than force > drivers to worry about it. > > Does anyone else have an opinion, please? Okay, I feel convinced that we should indeed let all vlan traffic thru, thanks everyone! :) Markus could you try to come up with a patch for the net/core/dev.c handling which would clear NETIF_F_HW_VLAN_CTAG_FILTER and NETIF_F_HW_VLAN_STAG_FILTER automatically so the drivers don't have to worry? Whatever the driver chooses to do to disable vlan filtering is a separate discussion AFAICT.
On Thu, Nov 12, 2020 at 11:53:15AM +0100, Markus Blöchl wrote: > From what I can see, most other drivers use a special hardware register > flag to enable promiscuous mode, which overrules all other filters. Yes, but it may not mean what you think. > e.g. from the ASIX AX88178 datasheet: > > PRO: PACKET_TYPE_PROMISCUOUS. > 1: All frames received by the ASIC are forwarded up toward the host. > 0: Disabled (default). See, that's one definition of promiscuity that is really strange, and not backed by any standards body that I know of (if you know otherwise, please speak up). > It is just so that the lan78xx controllers don't have this explicit flag. Which is not surprising, at least under that description. Others may be inclined to call that behavior "packet trapping" when applying it to e.g. an Ethernet switch. There might be an entire discussion about how promiscuity does _not_ mean "deliver all packets to the CPU" that might be of interest to you: https://lkml.org/lkml/2019/8/29/255 > But since my change is more controversial than I anticipated, I would like > to take a step back and ask some general questions first: > > We used to connect something akin to a port mirror to a lan7800 NIC > (and a few others) in order to record and debug some VLAN heavy traffic. > On older kernel versions putting the interface into promiscuous mode > and opening and binding an AF_PACKET socket (or just throwing tcpdump > or libpcap at it) was sufficient. > After a kernel upgrade the same setup missed most of the traffic. > Does this qualify as a regression? Why not? If something used to work but no longer does, that's what a regression is. But fixing it depends on whether it should have worked like that in the first place or not. That we don't know. > Should there be a documented and future proof way to receive *all* > valid ethernet frames from an interface? Yes, there should. > This could be something like: > > a) - Bring up the interface > - Put the interface into promiscuous mode This one would be necessary in order to not drop packets with unknown addresses, not more. > - Open, bind and read a raw AF_PACKET socket with ETH_P_ALL > - Patch up the 801.1Q headers if required. What do you mean by "patching up"? Are you talking about the fact that packets are untagged by the stack in the receive path anyway, and the VLAN header would need to be reconstructed? https://patchwork.ozlabs.org/project/netdev/patch/e06dbb47-2d1c-03ca-4cd7-cc958b6a939e@gmail.com/ > > b) - The same as a) > - Additionally enumerate and disable all available offloading features If said offloading features have to do with the CPU not receiving some frames any longer, and you _do_ want the CPU to see them, then obviously said offloading features should be disabled. This includes not only VLAN filtering, but also bridging offload, IP forwarding offload, etc. I'd say that (b) should be sufficient, but not future-proof in the sense that new offloading features might come every day, and they would need to be disabled on a case-by-case basis. For the forwarding offload, there would be no question whatsoever that you'd need to turn it off, or add a mirroring rule towards the CPU, or do _something_ user-visible, to get that traffic. But as for VLAN filtering offload in particular, there's the (not negligible at all) precedent created by the bridge driver, that Ido pointed out. That would be a decision for the maintainers to make, if the Linux network stack creates its own definition of promiscuity which openly contradicts IEEE's. One might perhaps try to argue that the VLAN ID is an integral part of the station's address (which is true if you look at it from the perspective of an IEEE 802.1Q bridge), and therefore promiscuity should apply on the VLAN ID too, not just the MAC address. Either way, that should be made more clear than it currently is, once a decision is taken. > c) - Use libpcap / Do whatever libpcap does (like with TPACKET) > In this case you need to help me convince the tcpdump folks that this > is a bug on their side... ;-) Well, that assumes that today, tcpdump can always capture all traffic on all types of interfaces, something which is already false (see bridging offload / IP forwarding offload). There, it was even argued that you'd better be 100% sure that you want to see all received traffic, as the interfaces can be very high-speed, and not even a mirroring rule might guarantee reception of 100% of the traffic. That's why things like sFlow / tc-sample exist. > d) - Read the controller datasheet > - Read the kernel documentation > - Read your kernels and drivers sources > - Do whatever might be necessary Yes, in general reading is helpful, but I'm not quite sure where you're going with this? > e) - No, there is no guaranteed way to to this No, there should be a way to achieve that through some sort of configuration. > Any opinions on these questions? My 2 cents have just been shared. > After those are answered, I am open to suggestions on how to fix this > differently (if still needed). Turn off VLAN filtering, or get a commonly accepted definition of promiscuity?
On Sat, Nov 14, 2020 at 08:11:03PM +0200, Vladimir Oltean wrote: > On Thu, Nov 12, 2020 at 11:53:15AM +0100, Markus Blöchl wrote: > > From what I can see, most other drivers use a special hardware register > > flag to enable promiscuous mode, which overrules all other filters. > > Yes, but it may not mean what you think. > > > e.g. from the ASIX AX88178 datasheet: > > > > PRO: PACKET_TYPE_PROMISCUOUS. > > 1: All frames received by the ASIC are forwarded up toward the host. > > 0: Disabled (default). > > See, that's one definition of promiscuity that is really strange, and > not backed by any standards body that I know of (if you know otherwise, > please speak up). > > > It is just so that the lan78xx controllers don't have this explicit flag. > > Which is not surprising, at least under that description. Others may be > inclined to call that behavior "packet trapping" when applying it to > e.g. an Ethernet switch. > > There might be an entire discussion about how promiscuity does _not_ > mean "deliver all packets to the CPU" that might be of interest to you: > https://lkml.org/lkml/2019/8/29/255 If I glanced over this discussion correctly, it is about avoiding promiscuity under certain circumstances (while promiscuity was only requested for switching and not by userspace) because HW promiscuity is not needed in that particular case. As I currently see it, there are two common use cases for promiscuity: 1) Applying filtering, switching and other logic on the CPU. This could be due to limited resources in the NIC, e.g. when there are too many unicast addresses configured on that interface or simply an unavailable hardware capability. 2) Sniffing the wire. The kernel uses IFF_PROMISC (or `__dev_set_promiscuity()`) for both, which obviously can be overkill in the first case. The question remains, what does IFF_PROMISC exactly mean for userspace (which, I guess, most often uses it for 2)? > > > But since my change is more controversial than I anticipated, I would like > > to take a step back and ask some general questions first: > > > > We used to connect something akin to a port mirror to a lan7800 NIC > > (and a few others) in order to record and debug some VLAN heavy traffic. > > On older kernel versions putting the interface into promiscuous mode > > and opening and binding an AF_PACKET socket (or just throwing tcpdump > > or libpcap at it) was sufficient. > > After a kernel upgrade the same setup missed most of the traffic. > > Does this qualify as a regression? Why not? > > If something used to work but no longer does, that's what a regression > is. But fixing it depends on whether it should have worked like that in > the first place or not. That we don't know. Admittedly, I certainly don't know, but hoped to find whom who does on this list. ;-) > > > Should there be a documented and future proof way to receive *all* > > valid ethernet frames from an interface? > > Yes, there should. > > > This could be something like: > > > > a) - Bring up the interface > > - Put the interface into promiscuous mode > > This one would be necessary in order to not drop packets with unknown > addresses, not more. > > > - Open, bind and read a raw AF_PACKET socket with ETH_P_ALL > > - Patch up the 801.1Q headers if required. > > What do you mean by "patching up"? Are you talking about the fact that > packets are untagged by the stack in the receive path anyway, and the > VLAN header would need to be reconstructed? > https://patchwork.ozlabs.org/project/netdev/patch/e06dbb47-2d1c-03ca-4cd7-cc958b6a939e@gmail.com/ Yes, that's exactly what I was referring to. I find it slightly annoying on RAW sockets, but it's documented and (hopefully) consistent behaviour now, so okay for me. > > > > > b) - The same as a) > > - Additionally enumerate and disable all available offloading features > > If said offloading features have to do with the CPU not receiving some > frames any longer, and you _do_ want the CPU to see them, then obviously > said offloading features should be disabled. This includes not only VLAN > filtering, but also bridging offload, IP forwarding offload, etc. > > I'd say that (b) should be sufficient, but not future-proof in the sense > that new offloading features might come every day, and they would need > to be disabled on a case-by-case basis. > > For the forwarding offload, there would be no question whatsoever that > you'd need to turn it off, or add a mirroring rule towards the CPU, or > do _something_ user-visible, to get that traffic. But as for VLAN > filtering offload in particular, there's the (not negligible at all) > precedent created by the bridge driver, that Ido pointed out. That would > be a decision for the maintainers to make, if the Linux network stack > creates its own definition of promiscuity which openly contradicts IEEE's. > One might perhaps try to argue that the VLAN ID is an integral part of > the station's address (which is true if you look at it from the > perspective of an IEEE 802.1Q bridge), and therefore promiscuity should > apply on the VLAN ID too, not just the MAC address. Either way, that > should be made more clear than it currently is, once a decision is > taken. In that case, maybe new features which could alter user-visible behaviour should not be enabled by default? If I am not mistaken, bridging offload, IP forwarding offload and similar have to be enabled explicitly, at least. I am not convinced that this definition would indeed contradict the IEEE standard. It might just be a stronger one with more guarantees. Assuming you have a very stupid NIC without any filtering or offloading capabilities, which would always forward all frames to the CPU. Would this NIC not comply to the standard? @Jakub May I take your answer as a final decision or would you prefer some more input on this? > > > c) - Use libpcap / Do whatever libpcap does (like with TPACKET) > > In this case you need to help me convince the tcpdump folks that this > > is a bug on their side... ;-) > > Well, that assumes that today, tcpdump can always capture all traffic on > all types of interfaces, something which is already false (see bridging > offload / IP forwarding offload). There, it was even argued that you'd > better be 100% sure that you want to see all received traffic, as the > interfaces can be very high-speed, and not even a mirroring rule might > guarantee reception of 100% of the traffic. That's why things like sFlow > / tc-sample exist. > > > d) - Read the controller datasheet > > - Read the kernel documentation > > - Read your kernels and drivers sources > > - Do whatever might be necessary > > Yes, in general reading is helpful, but I'm not quite sure where you're > going with this? Well, that just meant, that there should always be a way, but no universal one. Which one depends on your exact hardware setup or maybe even the current constellation of stars... > > > e) - No, there is no guaranteed way to to this > > No, there should be a way to achieve that through some sort of > configuration. > > > Any opinions on these questions? > > My 2 cents have just been shared. > > > After those are answered, I am open to suggestions on how to fix this > > differently (if still needed). > > Turn off VLAN filtering, or get a commonly accepted definition of > promiscuity? Other Drivers ============= So I tried to figure out what other existing hardware drivers do by grepping for drivers which do something on a change to NETIF_F_HW_VLAN_CTAG_FILTER. Here are the results of me trying to understand the drivers quickly. I hope it's somewhat close and helps: 1) aqc111 This controller has a HW register flag for promiscuous mode. I am not sure what it does, but since this is another ASIX device, the documentation from above should apply. 2) lan78xx Here it began ... 3) ixgbe This driver disables HW_VLAN_CTAG_FILTER if IFF_PROMISC is set. 4) ice I don't know. 5) ocelot_net This driver does not support promiscuous mode with the following explanation: /* This doesn't handle promiscuous mode because the bridge core is * setting IFF_PROMISC on all slave interfaces and all frames would be * forwarded to the CPU port. */ See the already mentioned discussion https://lkml.org/lkml/2019/8/29/255 for more. 6) liquidio This driver also has a HW flag for promiscuous mode. I don't know what it does, though. 7) mvpp2 This driver disables vid filtering if IFF_PROMISC is set. 8) efx /* Disable VLAN filtering by default. It may be enforced if * the feature is fixed (i.e. VLAN filters are required to * receive VLAN tagged packets due to vPort restrictions). */ I did not find a variant that really honors this feature. 9) ef100 I don't know. 10) mlx5 This driver sets a HW promisc flag, adds an `ANY` vlan filter rule and ignores further changes to vlan filtering if IFF_PROMISC is set. 11) nfp This driver uses a HW promisc flag. 12) atlantic I don't know. 13) xlgmac This one has an interesting comment in `xlgmac_set_promiscuous_mode`: /* Hardware will still perform VLAN filtering in promiscuous mode */ if (enable) { xlgmac_disable_rx_vlan_filtering(pdata); } else { if (pdata->netdev->features & NETIF_F_HW_VLAN_CTAG_FILTER) xlgmac_enable_rx_vlan_filtering(pdata); } 14) enetc I think, the feature is overridden everytime in `set_rx_mode` as long as IFF_PROMISC is set. 15) xgbe In `xgbe_set_promiscuous_mode` once again: /* Hardware will still perform VLAN filtering in promiscuous mode */ if (enable) { xgbe_disable_rx_vlan_filtering(pdata); } else { if (pdata->netdev->features & NETIF_F_HW_VLAN_CTAG_FILTER) xgbe_enable_rx_vlan_filtering(pdata); } Implementation ============== I then tried to come up with a solution in net/core that would universally disable vlan filtering in promiscuous mode. Removing the features in `netdev_fix_features` is easily done, but updating the active set of features whenever IFF_PROMISC changes seems hard. `__dev_set_promiscuity` is often called in atomic context but `.ndo_set_features` can sleep in many drivers. Adding a work_queue or similar to every net_device seems clumsy and inappropriate. Rewriting ndo_set_features of all drivers to be atomic is not a task you should ask from me... Calling `netdev_update_features` after every entrypoint that locks the rtnl seems too error-prone and also clumsy. Only updating the features when promiscuity was explicitly requested by userspace in `dev_change_flags` might leave the device in a weird inconsistent state. Continue to let each driver enforce the kernels definition of promiscuity. They know how to update the features atomically. Then I am back at my original patch... I'm afraid, I might need some guidance on how to approach this. Thanks for your help and all of your responses Markus
On Thu, Nov 19, 2020 at 04:37:51PM +0100, Markus Blöchl wrote: > > There might be an entire discussion about how promiscuity does _not_ > > mean "deliver all packets to the CPU" that might be of interest to you: > > https://lkml.org/lkml/2019/8/29/255 > > If I glanced over this discussion correctly, it is about avoiding > promiscuity under certain circumstances (while promiscuity was > only requested for switching and not by userspace) because HW promiscuity > is not needed in that particular case. > > As I currently see it, there are two common use cases for promiscuity: > > 1) Applying filtering, switching and other logic on the CPU. > This could be due to limited resources in the NIC, e.g. when there > are too many unicast addresses configured on that interface or > simply an unavailable hardware capability. > > 2) Sniffing the wire. > > The kernel uses IFF_PROMISC (or `__dev_set_promiscuity()`) for both, > which obviously can be overkill in the first case. > The question remains, what does IFF_PROMISC exactly mean for userspace > (which, I guess, most often uses it for 2)? And that discussion ended with the conclusion that promiscuity is never enough to "sniff the wire". > > > b) - The same as a) > > > - Additionally enumerate and disable all available offloading features > > > > If said offloading features have to do with the CPU not receiving some > > frames any longer, and you _do_ want the CPU to see them, then obviously > > said offloading features should be disabled. This includes not only VLAN > > filtering, but also bridging offload, IP forwarding offload, etc. > > > > I'd say that (b) should be sufficient, but not future-proof in the sense > > that new offloading features might come every day, and they would need > > to be disabled on a case-by-case basis. > > > > For the forwarding offload, there would be no question whatsoever that > > you'd need to turn it off, or add a mirroring rule towards the CPU, or > > do _something_ user-visible, to get that traffic. But as for VLAN > > filtering offload in particular, there's the (not negligible at all) > > precedent created by the bridge driver, that Ido pointed out. That would > > be a decision for the maintainers to make, if the Linux network stack > > creates its own definition of promiscuity which openly contradicts IEEE's. > > One might perhaps try to argue that the VLAN ID is an integral part of > > the station's address (which is true if you look at it from the > > perspective of an IEEE 802.1Q bridge), and therefore promiscuity should > > apply on the VLAN ID too, not just the MAC address. Either way, that > > should be made more clear than it currently is, once a decision is > > taken. > > In that case, maybe new features which could alter user-visible behaviour > should not be enabled by default? > If I am not mistaken, bridging offload, IP forwarding offload and > similar have to be enabled explicitly, at least. Uhm, no? If L2 and L3 forwarding offload are supported by the hardware, the way things work is that said driver simply monitors the user commands and configures itself to seamlessly perform those operations in hardware. There isn't a knob that says "I don't want to do bridging in hardware between these two switchdev interfaces", if the driver implements bridging. > I am not convinced that this definition would indeed contradict the > IEEE standard. It might just be a stronger one with more guarantees. > Assuming you have a very stupid NIC without any filtering or offloading > capabilities, which would always forward all frames to the CPU. > Would this NIC not comply to the standard? I don't understand this. > > > After those are answered, I am open to suggestions on how to fix this > > > differently (if still needed). > > > > Turn off VLAN filtering, or get a commonly accepted definition of > > promiscuity? > > > Other Drivers > ============= > > So I tried to figure out what other existing hardware drivers do > by grepping for drivers which do something on a change to > NETIF_F_HW_VLAN_CTAG_FILTER. > > Here are the results of me trying to understand the drivers quickly. > I hope it's somewhat close and helps: > > 1) aqc111 > This controller has a HW register flag for promiscuous mode. > I am not sure what it does, but since this is another ASIX > device, the documentation from above should apply. > > 2) lan78xx > Here it began ... > > 3) ixgbe > This driver disables HW_VLAN_CTAG_FILTER if IFF_PROMISC is set. > > 4) ice > I don't know. > > 5) ocelot_net > This driver does not support promiscuous mode with the following > explanation: > /* This doesn't handle promiscuous mode because the bridge core is > * setting IFF_PROMISC on all slave interfaces and all frames would be > * forwarded to the CPU port. > */ > See the already mentioned discussion https://lkml.org/lkml/2019/8/29/255 > for more. The explanation for ocelot is bogus. Ocelot is a switch and it doesn't implement IFF_PROMISC precisely because of that: a switch port is promiscuous by definition. Also check out mlxsw_sp_set_rx_mode() in the mlxsw driver. It is deliberately empty for the same reason. > 6) liquidio > This driver also has a HW flag for promiscuous mode. > I don't know what it does, though. > > 7) mvpp2 > This driver disables vid filtering if IFF_PROMISC is set. > > 8) efx > /* Disable VLAN filtering by default. It may be enforced if > * the feature is fixed (i.e. VLAN filters are required to > * receive VLAN tagged packets due to vPort restrictions). > */ > I did not find a variant that really honors this feature. > > 9) ef100 > I don't know. > > 10) mlx5 > This driver sets a HW promisc flag, adds an `ANY` vlan filter rule > and ignores further changes to vlan filtering if IFF_PROMISC is set. > > 11) nfp > This driver uses a HW promisc flag. > > 12) atlantic > I don't know. > > 13) xlgmac > This one has an interesting comment in `xlgmac_set_promiscuous_mode`: > /* Hardware will still perform VLAN filtering in promiscuous mode */ > if (enable) { > xlgmac_disable_rx_vlan_filtering(pdata); > } else { > if (pdata->netdev->features & NETIF_F_HW_VLAN_CTAG_FILTER) > xlgmac_enable_rx_vlan_filtering(pdata); > } > > 14) enetc > I think, the feature is overridden everytime in `set_rx_mode` as long > as IFF_PROMISC is set. This one was added by me in 7070eea5e95a ("enetc: permit configuration of rx-vlan-filter with ethtool"). I'll fully admit that I disabled VLAN filtering when entering IFF_PROMISC out of stupidity, and the main goal of that patch was something different anyway. I didn't even think of checking the IEEE 802.3 standard, but then you had to ask :). I'm sure many more are in the same situation, and it's what led to this chaos. But I don't have any users that rely on this behavior of IFF_PROMISC, and if we were to agree on a definition of promiscuity that does not assume anything about VLAN filtering, I would be more than happy to delete that code. In fact I've been waiting for this to reach a conclusion so that I could do that.
On Thu, 19 Nov 2020 16:37:51 +0100 Markus Blöchl wrote: > Implementation > ============== > > I then tried to come up with a solution in net/core that would > universally disable vlan filtering in promiscuous mode. Thanks for taking a look! > Removing the features in `netdev_fix_features` is easily done, but > updating the active set of features whenever IFF_PROMISC changes > seems hard. > > `__dev_set_promiscuity` is often called in atomic context but > `.ndo_set_features` can sleep in many drivers. Are there paths other than __dev_set_rx_mode() which would call __dev_set_promiscuity() in atomic context? The saving grace about __dev_set_rx_mode() is that it sets promisc explicitly to disable unicast filtering (dev->uc_promisc), so IMO that case (dev->promiscuity == dev->uc_promisc) does not need to disable VLAN filtering. But IDK if that's the only atomic path. > Adding a work_queue or similar to every net_device seems clumsy and > inappropriate. > > Rewriting ndo_set_features of all drivers to be atomic is not a task > you should ask from me... > > Calling `netdev_update_features` after every entrypoint that locks > the rtnl seems too error-prone and also clumsy. > > Only updating the features when promiscuity was explicitly requested > by userspace in `dev_change_flags` might leave the device in a > weird inconsistent state. > > Continue to let each driver enforce the kernels definition of > promiscuity. They know how to update the features atomically. > Then I am back at my original patch... > > I'm afraid, I might need some guidance on how to approach this.
diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c index 65b315bc60ab..ac6c0beeac21 100644 --- a/drivers/net/usb/lan78xx.c +++ b/drivers/net/usb/lan78xx.c @@ -2324,13 +2324,15 @@ static int lan78xx_set_mac_addr(struct net_device *netdev, void *p) } /* Enable or disable Rx checksum offload engine */ -static int lan78xx_set_features(struct net_device *netdev, - netdev_features_t features) +static void lan78xx_update_features(struct net_device *netdev, + netdev_features_t features) { struct lan78xx_net *dev = netdev_priv(netdev); struct lan78xx_priv *pdata = (struct lan78xx_priv *)(dev->data[0]); unsigned long flags; - int ret; + + if (netdev->flags & IFF_PROMISC) + features &= ~NETIF_F_HW_VLAN_CTAG_FILTER; spin_lock_irqsave(&pdata->rfe_ctl_lock, flags); @@ -2353,12 +2355,30 @@ static int lan78xx_set_features(struct net_device *netdev, pdata->rfe_ctl &= ~RFE_CTL_VLAN_FILTER_; spin_unlock_irqrestore(&pdata->rfe_ctl_lock, flags); +} + +static int lan78xx_set_features(struct net_device *netdev, + netdev_features_t features) +{ + struct lan78xx_net *dev = netdev_priv(netdev); + struct lan78xx_priv *pdata = (struct lan78xx_priv *)(dev->data[0]); + int ret; + + lan78xx_update_features(netdev, features); ret = lan78xx_write_reg(dev, RFE_CTL, pdata->rfe_ctl); return 0; } +static void lan78xx_set_rx_mode(struct net_device *netdev) +{ + /* Enable or disable checksum offload engines */ + lan78xx_update_features(netdev, netdev->features); + + lan78xx_set_multicast(netdev); +} + static void lan78xx_deferred_vlan_write(struct work_struct *param) { struct lan78xx_priv *pdata = @@ -2528,10 +2548,7 @@ static int lan78xx_reset(struct lan78xx_net *dev) pdata->rfe_ctl |= RFE_CTL_BCAST_EN_ | RFE_CTL_DA_PERFECT_; ret = lan78xx_write_reg(dev, RFE_CTL, pdata->rfe_ctl); - /* Enable or disable checksum offload engines */ - lan78xx_set_features(dev->net, dev->net->features); - - lan78xx_set_multicast(dev->net); + lan78xx_set_rx_mode(dev->net); /* reset PHY */ ret = lan78xx_read_reg(dev, PMT_CTL, &buf); @@ -3613,7 +3630,7 @@ static const struct net_device_ops lan78xx_netdev_ops = { .ndo_set_mac_address = lan78xx_set_mac_addr, .ndo_validate_addr = eth_validate_addr, .ndo_do_ioctl = phy_do_ioctl_running, - .ndo_set_rx_mode = lan78xx_set_multicast, + .ndo_set_rx_mode = lan78xx_set_rx_mode, .ndo_set_features = lan78xx_set_features, .ndo_vlan_rx_add_vid = lan78xx_vlan_rx_add_vid, .ndo_vlan_rx_kill_vid = lan78xx_vlan_rx_kill_vid,
The rx-vlan-filter feature flag prevents unexpected tagged frames on the wire from reaching the kernel in promiscuous mode. Disable this offloading feature in the lan7800 controller whenever IFF_PROMISC is set and make sure that the hardware features are updated when IFF_PROMISC changes. Signed-off-by: Markus Blöchl <markus.bloechl@ipetronik.com> --- Notes: When sniffing ethernet using a LAN7800 ethernet controller, vlan-tagged frames are silently dropped by the controller due to the RFE_CTL_VLAN_FILTER flag being set by default since commit 4a27327b156e("net: lan78xx: Add support for VLAN filtering."). In order to receive those tagged frames it is necessary to manually disable rx vlan filtering using ethtool ( `ethtool -K ethX rx-vlan-filter off` or corresponding ioctls ). Setting all bits in the vlan filter table to 1 is an even worse approach, imho. As a user I would probably expect that setting IFF_PROMISC should be enough in all cases to receive all valid traffic. Therefore I think this behaviour is a bug in the driver, since other drivers (notably ixgbe) automatically disable rx-vlan-filter when IFF_PROMISC is set. Please correct me if I am wrong here. drivers/net/usb/lan78xx.c | 33 +++++++++++++++++++++++++-------- 1 file changed, 25 insertions(+), 8 deletions(-) base-commit: 4e0396c59559264442963b349ab71f66e471f84d -- 2.29.2 Impressum/Imprint: https://www.ipetronik.com/impressum