Message ID | 20210719170637.435541-1-razor@blackwall.org (mailing list archive) |
---|---|
Headers | show |
Series | net: bridge: multicast: add vlan support | expand |
Hello: This series was applied to netdev/net-next.git (refs/heads/master): On Mon, 19 Jul 2021 20:06:22 +0300 you wrote: > From: Nikolay Aleksandrov <nikolay@nvidia.com> > > Hi all, > This patchset adds initial per-vlan multicast support, most of the code > deals with moving to multicast context pointers from bridge/port pointers. > That allows us to switch them with the per-vlan contexts when a multicast > packet is being processed and vlan multicast snooping has been enabled. > That is controlled by a global bridge option added in patch 06 which is > off by default (BR_BOOLOPT_MCAST_VLAN_SNOOPING). It is important to note > that this option can change only under RTNL and doesn't require > multicast_lock, so we need to be careful when retrieving mcast contexts > in parallel. For packet processing they are switched only once in > br_multicast_rcv() and then used until the packet has been processed. > For the most part we need these contexts only to read config values and > check if they are disabled. The global mcast state which is maintained > consists of querier and router timers, the rest are config options. > The port mcast state which is maintained consists of query timer and > link to router port list if it's ever marked as a router port. Port > multicast contexts _must_ be used only with their respective global > contexts, that is a bridge port's mcast context must be used only with > bridge's global mcast context and a vlan/port's mcast context must be > used only with that vlan's global mcast context due to the router port > lists. This way a bridge port can be marked as a router in multiple > vlans, but might not be a router in some other vlan. Also this allows us > to have per-vlan querier elections, per-vlan queries and basically the > whole multicast state becomes per-vlan when the option is enabled. > One of the hardest parts is synchronization with vlan's memory > management, that is done through a new vlan flag: BR_VLFLAG_MCAST_ENABLED > which is changed only under multicast_lock. When a vlan is being > destroyed first that flag is removed under the lock, then the multicast > context is torn down which includes waiting for any outstanding context > timers. Since all of the vlan processing depends on BR_VLFLAG_MCAST_ENABLED > it must be checked first if the contexts are vlan and the multicast_lock > has been acquired. That is done by all IGMP/MLD packet processing > functions and timers. When processing a packet we have RCU so the vlan > memory won't be freed, but if the flag is missing we must not process it. > The timers are synchronized in the same way with the addition of waiting > for them to finish in case they are running after removing the flag > under multicast_lock (i.e. they were waiting for the lock). Multicast vlan > snooping requires vlan filtering to be enabled, if it's disabled then > snooping gets automatically disabled, too. BR_VLFLAG_GLOBAL_MCAST_ENABLED > controls if a vlan has BR_VLFLAG_MCAST_ENABLED set which is used in all > vlan disabled checks. We need both flags because one is controlled by > user-space globally (BR_VLFLAG_GLOBAL_MCAST_ENABLED) and the other is > for a particular bridge/vlan or port/vlan entry (BR_VLFLAG_MCAST_ENABLED). > Since the latter is also used for synchronization between the multicast > and vlan code, and also controlled by BR_VLFLAG_GLOBAL_MCAST_ENABLED we > rely on it when checking if a vlan context is disabled. The multicast > fast-path has 3 new bit tests on the cache-hot bridge flags field, I > didn't observe any measurable difference. I haven't forced either > context options to be always disabled when the other type is enabled > because the state consists of timers which either expire (router) or > don't affect the normal operation. Some options, like the mcast querier > one, won't be allowed to change for the disabled context type, that will > come with a future patch-set which adds per-vlan querier control. > > [...] Here is the summary with links: - [net-next,01/15] net: bridge: multicast: factor out port multicast context https://git.kernel.org/netdev/net-next/c/9632233e7de8 - [net-next,02/15] net: bridge: multicast: factor out bridge multicast context https://git.kernel.org/netdev/net-next/c/d3d065c0032b - [net-next,03/15] net: bridge: multicast: use multicast contexts instead of bridge or port https://git.kernel.org/netdev/net-next/c/adc47037a7d5 - [net-next,04/15] net: bridge: vlan: add global and per-port multicast context https://git.kernel.org/netdev/net-next/c/613d61dbef8e - [net-next,05/15] net: bridge: multicast: add vlan state initialization and control https://git.kernel.org/netdev/net-next/c/7b54aaaf53cb - [net-next,06/15] net: bridge: add vlan mcast snooping knob https://git.kernel.org/netdev/net-next/c/f4b7002a7076 - [net-next,07/15] net: bridge: multicast: add helper to get port mcast context from port group https://git.kernel.org/netdev/net-next/c/74edfd483de8 - [net-next,08/15] net: bridge: multicast: use the port group to port context helper https://git.kernel.org/netdev/net-next/c/eb1593a0b4c4 - [net-next,09/15] net: bridge: multicast: check if should use vlan mcast ctx https://git.kernel.org/netdev/net-next/c/4cdd0d10f31d - [net-next,10/15] net: bridge: multicast: add vlan querier and query support https://git.kernel.org/netdev/net-next/c/615cc23e6283 - [net-next,11/15] net: bridge: multicast: include router port vlan id in notifications https://git.kernel.org/netdev/net-next/c/1e9ca45662d6 - [net-next,12/15] net: bridge: vlan: add support for global options https://git.kernel.org/netdev/net-next/c/47ecd2dbd8ec - [net-next,13/15] net: bridge: vlan: add support for dumping global vlan options https://git.kernel.org/netdev/net-next/c/743a53d9636a - [net-next,14/15] net: bridge: vlan: notify when global options change https://git.kernel.org/netdev/net-next/c/9aba624d7cb2 - [net-next,15/15] net: bridge: vlan: add mcast snooping control https://git.kernel.org/netdev/net-next/c/9dee572c3848 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html
Hi Hik, everyone! On Mon, Jul 19, 2021 at 20:06, Nikolay Aleksandrov <razor@blackwall.org> wrote: > From: Nikolay Aleksandrov <nikolay@nvidia.com> > This patchset adds initial per-vlan multicast support, most of the code > deals with moving to multicast context pointers from bridge/port pointers. Awesome work, this looks very interesting! :) I've already built and tested net-next for regressions on Marvell SOHO switches, looking good so far. Curious, are you planning querier per-vlan, including use-ifaddr support as well? In our in-house hack, which I posted a few years ago, we added some "dumpster diving" to inet_select_addr(), but it got rather tricky. So I've been leaning towards having that in userspace instead. > Future patch-sets which build on this one (in order): > - iproute2 support for all the new uAPIs I'm very eager to try out all the new IGMP per-VLAN stuff, do you have any branch of the iproute2 support available yet for testing? For now I've hard-coded BROPT_MCAST_VLAN_SNOOPING_ENABLED in br_multicast_init() as a workaround, and everything seems to work just as expected :-) Best regards /Joachim
On 19/08/2021 19:01, Joachim Wiberg wrote: > Hi Hik, everyone! > Hi, > On Mon, Jul 19, 2021 at 20:06, Nikolay Aleksandrov <razor@blackwall.org> wrote: >> From: Nikolay Aleksandrov <nikolay@nvidia.com> >> This patchset adds initial per-vlan multicast support, most of the code >> deals with moving to multicast context pointers from bridge/port pointers. > > Awesome work, this looks very interesting! :) I've already built and > tested net-next for regressions on Marvell SOHO switches, looking good > so far. > > Curious, are you planning querier per-vlan, including use-ifaddr support > as well? In our in-house hack, which I posted a few years ago, we added > some "dumpster diving" to inet_select_addr(), but it got rather tricky. > So I've been leaning towards having that in userspace instead. > Yes, that is already supported (use-ifaddr needs attention though). In my next patch-set where I added the initial global vlan mcast options I added control for per-vlan querier with per-vlan querier elections and so on. The use-ifaddr needs more work though, that's why I still haven't added that option. I need to add the per-vlan/port router control option so we'll have mostly everything ready in a single release. >> Future patch-sets which build on this one (in order): >> - iproute2 support for all the new uAPIs > > I'm very eager to try out all the new IGMP per-VLAN stuff, do you have > any branch of the iproute2 support available yet for testing? For now > I've hard-coded BROPT_MCAST_VLAN_SNOOPING_ENABLED in br_multicast_init() > as a workaround, and everything seems to work just as expected :-) I don't have it public yet because I need to polish the support, currently it's very rough, enough for testing purposes for these patch-sets. :) I plan to work on that after I finish with the per-vlan/port router control. > > Best regards > /Joachim > Thanks, Nik
On Thu, Aug 19, 2021 at 19:22, Nikolay Aleksandrov <razor@blackwall.org> wrote: > On 19/08/2021 19:01, Joachim Wiberg wrote: >> On Mon, Jul 19, 2021 at 20:06, Nikolay Aleksandrov <razor@blackwall.org> wrote: >>> From: Nikolay Aleksandrov <nikolay@nvidia.com> >> Curious, are you planning querier per-vlan, including use-ifaddr support >> as well? In our in-house hack, which I posted a few years ago, we added >> some "dumpster diving" to inet_select_addr(), but it got rather tricky. >> So I've been leaning towards having that in userspace instead. > Yes, that is already supported (use-ifaddr needs attention though). In my next > patch-set where I added the initial global vlan mcast options I added control > for per-vlan querier with per-vlan querier elections and so on. The use-ifaddr > needs more work though, that's why I still haven't added that option. I need > to add the per-vlan/port router control option so we'll have mostly everything > ready in a single release. Wow, OK now we're talking, yeah that would be great to have in place as well! >>> Future patch-sets which build on this one (in order): >>> - iproute2 support for all the new uAPIs >> I'm very eager to try out all the new IGMP per-VLAN stuff, do you have >> any branch of the iproute2 support available yet for testing? > I don't have it public yet because I need to polish the support, currently > it's very rough, enough for testing purposes for these patch-sets. :) > I plan to work on that after I finish with the per-vlan/port router control. Alright, I can appreciate that. Really looking forward to this, I'll be patiently waiting here in the wings, testing this out. Fantastic work with this, again! :) All the best /Joachim
From: Nikolay Aleksandrov <nikolay@nvidia.com> Hi all, This patchset adds initial per-vlan multicast support, most of the code deals with moving to multicast context pointers from bridge/port pointers. That allows us to switch them with the per-vlan contexts when a multicast packet is being processed and vlan multicast snooping has been enabled. That is controlled by a global bridge option added in patch 06 which is off by default (BR_BOOLOPT_MCAST_VLAN_SNOOPING). It is important to note that this option can change only under RTNL and doesn't require multicast_lock, so we need to be careful when retrieving mcast contexts in parallel. For packet processing they are switched only once in br_multicast_rcv() and then used until the packet has been processed. For the most part we need these contexts only to read config values and check if they are disabled. The global mcast state which is maintained consists of querier and router timers, the rest are config options. The port mcast state which is maintained consists of query timer and link to router port list if it's ever marked as a router port. Port multicast contexts _must_ be used only with their respective global contexts, that is a bridge port's mcast context must be used only with bridge's global mcast context and a vlan/port's mcast context must be used only with that vlan's global mcast context due to the router port lists. This way a bridge port can be marked as a router in multiple vlans, but might not be a router in some other vlan. Also this allows us to have per-vlan querier elections, per-vlan queries and basically the whole multicast state becomes per-vlan when the option is enabled. One of the hardest parts is synchronization with vlan's memory management, that is done through a new vlan flag: BR_VLFLAG_MCAST_ENABLED which is changed only under multicast_lock. When a vlan is being destroyed first that flag is removed under the lock, then the multicast context is torn down which includes waiting for any outstanding context timers. Since all of the vlan processing depends on BR_VLFLAG_MCAST_ENABLED it must be checked first if the contexts are vlan and the multicast_lock has been acquired. That is done by all IGMP/MLD packet processing functions and timers. When processing a packet we have RCU so the vlan memory won't be freed, but if the flag is missing we must not process it. The timers are synchronized in the same way with the addition of waiting for them to finish in case they are running after removing the flag under multicast_lock (i.e. they were waiting for the lock). Multicast vlan snooping requires vlan filtering to be enabled, if it's disabled then snooping gets automatically disabled, too. BR_VLFLAG_GLOBAL_MCAST_ENABLED controls if a vlan has BR_VLFLAG_MCAST_ENABLED set which is used in all vlan disabled checks. We need both flags because one is controlled by user-space globally (BR_VLFLAG_GLOBAL_MCAST_ENABLED) and the other is for a particular bridge/vlan or port/vlan entry (BR_VLFLAG_MCAST_ENABLED). Since the latter is also used for synchronization between the multicast and vlan code, and also controlled by BR_VLFLAG_GLOBAL_MCAST_ENABLED we rely on it when checking if a vlan context is disabled. The multicast fast-path has 3 new bit tests on the cache-hot bridge flags field, I didn't observe any measurable difference. I haven't forced either context options to be always disabled when the other type is enabled because the state consists of timers which either expire (router) or don't affect the normal operation. Some options, like the mcast querier one, won't be allowed to change for the disabled context type, that will come with a future patch-set which adds per-vlan querier control. Another important addition is the global vlan options, so far we had only per bridge/port vlan options but in order to control vlan multicast snooping globally we need to add a new type of global vlan options. They can be changed only on the bridge device and are dumped only when a special flag is set in the dump request. The first global option is vlan mcast snooping control, it controls the vlan BR_VLFLAG_GLOBAL_MCAST_ENABLED private flag. It can be set only on master vlan entries. There will be many more global vlan options in the future both for multicast config and other per-vlan options (e.g. STP). There's a lot of room for improvements, I'll do some of the initial ones but splitting the state to different contexts opens the door for a lot more. Also any new multicast options become vlan-supported with very little to no effort by using the same contexts. Short patch description: patches 01-04: initial mcast context add, no functional changes patch 05: adds vlan mcast init and control helpers and uses them on vlan create/destroy patch 06: adds a global bridge mcast vlan snooping knob (default off) patches 07-08: add a helper for users which must derive the contexts based on current bridge and vlan options (e.g. timers) patch 09: adds checks for disabled vlan contexts in packet processing and timers patch 10: adds support for per-vlan querier and tagged queries patch 11: adds router port vlan id in the notifications patches 12-14: add global vlan options support (change, dump, notify) patch 15: adds per-vlan global mcast snooping control Future patch-sets which build on this one (in order): - vlan state mcast handling - user-space mdb contexts (currently only the bridge contexts are used there) - all bridge multicast config options added per-vlan global and per vlan/port - iproute2 support for all the new uAPIs - selftests This set has been stress-tested (deleting/adding ports/vlans while changing vlan mcast snooping while processing IGMP/MLD packets), and also has passed all bridge self-tests. I'm sending this set as early as possible since there're a few more related sets that should go in the same release to get proper and full mcast vlan snooping support. Thanks, Nik Nikolay Aleksandrov (15): net: bridge: multicast: factor out port multicast context net: bridge: multicast: factor out bridge multicast context net: bridge: multicast: use multicast contexts instead of bridge or port net: bridge: vlan: add global and per-port multicast context net: bridge: multicast: add vlan state initialization and control net: bridge: add vlan mcast snooping knob net: bridge: multicast: add helper to get port mcast context from port group net: bridge: multicast: use the port group to port context helper net: bridge: multicast: check if should use vlan mcast ctx net: bridge: multicast: add vlan querier and query support net: bridge: multicast: include router port vlan id in notifications net: bridge: vlan: add support for global options net: bridge: vlan: add support for dumping global vlan options net: bridge: vlan: notify when global options change net: bridge: vlan: add mcast snooping control include/uapi/linux/if_bridge.h | 18 + net/bridge/br.c | 9 +- net/bridge/br_device.c | 14 +- net/bridge/br_forward.c | 7 +- net/bridge/br_input.c | 17 +- net/bridge/br_mdb.c | 64 +- net/bridge/br_multicast.c | 1656 ++++++++++++++++++----------- net/bridge/br_multicast_eht.c | 92 +- net/bridge/br_netlink.c | 41 +- net/bridge/br_private.h | 355 +++++-- net/bridge/br_private_mcast_eht.h | 3 +- net/bridge/br_sysfs_br.c | 38 +- net/bridge/br_sysfs_if.c | 2 +- net/bridge/br_vlan.c | 85 +- net/bridge/br_vlan_options.c | 216 ++++ 15 files changed, 1791 insertions(+), 826 deletions(-)