diff mbox series

[net-next,1/4] vxlan: Join / leave MC group after remote changes

Message ID 6986ccd18ece80d1c1adb028972a2bca603b9c11.1738949252.git.petrm@nvidia.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series vxlan: Join / leave MC group when reconfigured | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 8 of 8 maintainers
netdev/build_clang success Errors and warnings before: 2 this patch: 2
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1 this patch: 1
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 33 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2025-02-08--03-00 (tests: 889)

Commit Message

Petr Machata Feb. 7, 2025, 5:34 p.m. UTC
When a vxlan netdevice is brought up, if its default remote is a multicast
address, the device joins the indicated group.

Therefore when the multicast remote address changes, the device should
leave the current group and subscribe to the new one. Similarly when the
interface used for endpoint communication is changed in a situation when
multicast remote is configured. This is currently not done.

Both vxlan_igmp_join() and vxlan_igmp_leave() can however fail. So it is
possible that with such fix, the netdevice will end up in an inconsistent
situation where the old group is not joined anymore, but joining the
new group fails. Should we join the new group first, and leave the old one
second, we might end up in the opposite situation, where both groups are
joined. Undoing any of this during rollback is going to be similarly
problematic.

One solution would be to just forbid the change when the netdevice is up.
However in vnifilter mode, changing the group address is allowed, and these
problems are simply ignored (see vxlan_vni_update_group()):

 # ip link add name br up type bridge vlan_filtering 1
 # ip link add vx1 up master br type vxlan external vnifilter local 192.0.2.1 dev lo dstport 4789
 # bridge vni add dev vx1 vni 200 group 224.0.0.1
 # tcpdump -i lo &
 # bridge vni add dev vx1 vni 200 group 224.0.0.2
 18:55:46.523438 IP 0.0.0.0 > 224.0.0.22: igmp v3 report, 1 group record(s)
 18:55:46.943447 IP 0.0.0.0 > 224.0.0.22: igmp v3 report, 1 group record(s)
 # bridge vni
 dev               vni                group/remote
 vx1               200                224.0.0.2

Having two different modes of operation for conceptually the same interface
is silly, so in this patch, just do what the vnifilter code does and deal
with the errors by crossing fingers real hard.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
---

Notes:
CC: Andrew Lunn <andrew+netdev@lunn.ch>
CC: Nikolay Aleksandrov <razor@blackwall.org>
CC: Roopa Prabhu <roopa@nvidia.com>
CC: Menglong Dong <menglong8.dong@gmail.com>
CC: Guillaume Nault <gnault@redhat.com>

 drivers/net/vxlan/vxlan_core.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

Comments

Paolo Abeni Feb. 11, 2025, 2:17 p.m. UTC | #1
On 2/7/25 6:34 PM, Petr Machata wrote:
> @@ -3899,6 +3904,11 @@ static void vxlan_config_apply(struct net_device *dev,
>  			dev->mtu = conf->mtu;
>  
>  		vxlan->net = src_net;
> +
> +	} else if (vxlan->dev->flags & IFF_UP) {
> +		if (vxlan_addr_multicast(&vxlan->default_dst.remote_ip) &&
> +		    rem_changed)
> +			vxlan_multicast_leave(vxlan);

AFAICS vxlan_vni_update_group() is not completely ignore
vxlan_multicast_{leave,join} errors. Instead is bailing out as soon as
any error happens. For consistency's sake I think it would be better do
the same here.

Also I have the feeling that ending-up in an inconsistent status with no
group joined would be less troublesome than the opposite.

/P
Petr Machata Feb. 11, 2025, 2:56 p.m. UTC | #2
Paolo Abeni <pabeni@redhat.com> writes:

> On 2/7/25 6:34 PM, Petr Machata wrote:
>> @@ -3899,6 +3904,11 @@ static void vxlan_config_apply(struct net_device *dev,
>>  			dev->mtu = conf->mtu;
>>  
>>  		vxlan->net = src_net;
>> +
>> +	} else if (vxlan->dev->flags & IFF_UP) {
>> +		if (vxlan_addr_multicast(&vxlan->default_dst.remote_ip) &&
>> +		    rem_changed)
>> +			vxlan_multicast_leave(vxlan);
>
> AFAICS vxlan_vni_update_group() is not completely ignore
> vxlan_multicast_{leave,join} errors. Instead is bailing out as soon as
> any error happens. For consistency's sake I think it would be better do
> the same here.
>
> Also I have the feeling that ending-up in an inconsistent status with no
> group joined would be less troublesome than the opposite.

This can already happen FWIW. If you currently want to change the remote
group address in a way that doesn't break things, you take the netdevice
down, then change it, then bring it back up. The leave during downing
can fail and will not be diagnosed. (Nor can it really be, you can't
veto downing.)

I can add the bail-outs that you ask for, but I don't know that there is
a way to resolve these issues for real.
Paolo Abeni Feb. 11, 2025, 3:52 p.m. UTC | #3
On 2/11/25 3:56 PM, Petr Machata wrote:
> Paolo Abeni <pabeni@redhat.com> writes:
>> On 2/7/25 6:34 PM, Petr Machata wrote:
>>> @@ -3899,6 +3904,11 @@ static void vxlan_config_apply(struct net_device *dev,
>>>  			dev->mtu = conf->mtu;
>>>  
>>>  		vxlan->net = src_net;
>>> +
>>> +	} else if (vxlan->dev->flags & IFF_UP) {
>>> +		if (vxlan_addr_multicast(&vxlan->default_dst.remote_ip) &&
>>> +		    rem_changed)
>>> +			vxlan_multicast_leave(vxlan);
>>
>> AFAICS vxlan_vni_update_group() is not completely ignore
>> vxlan_multicast_{leave,join} errors. Instead is bailing out as soon as
>> any error happens. For consistency's sake I think it would be better do
>> the same here.
>>
>> Also I have the feeling that ending-up in an inconsistent status with no
>> group joined would be less troublesome than the opposite.
> 
> This can already happen FWIW. If you currently want to change the remote
> group address in a way that doesn't break things, you take the netdevice
> down, then change it, then bring it back up. The leave during downing
> can fail and will not be diagnosed. (Nor can it really be, you can't
> veto downing.)

I see.

> I can add the bail-outs that you ask for, but I don't know that there is
> a way to resolve these issues for real.

The main point I made was about consistency: making the
vxlan_config_apply() behavior as close as possible to
vxlan_vni_update_group() as stated in the commit message.

Cheers,

Paolo
Petr Machata Feb. 11, 2025, 10:11 p.m. UTC | #4
Paolo Abeni <pabeni@redhat.com> writes:

> On 2/11/25 3:56 PM, Petr Machata wrote:
>> Paolo Abeni <pabeni@redhat.com> writes:
>>> On 2/7/25 6:34 PM, Petr Machata wrote:
>>>> @@ -3899,6 +3904,11 @@ static void vxlan_config_apply(struct net_device *dev,
>>>>  			dev->mtu = conf->mtu;
>>>>  
>>>>  		vxlan->net = src_net;
>>>> +
>>>> +	} else if (vxlan->dev->flags & IFF_UP) {
>>>> +		if (vxlan_addr_multicast(&vxlan->default_dst.remote_ip) &&
>>>> +		    rem_changed)
>>>> +			vxlan_multicast_leave(vxlan);
>>>
>>> AFAICS vxlan_vni_update_group() is not completely ignore
>>> vxlan_multicast_{leave,join} errors. Instead is bailing out as soon as
>>> any error happens. For consistency's sake I think it would be better do
>>> the same here.
>>>
>>> Also I have the feeling that ending-up in an inconsistent status with no
>>> group joined would be less troublesome than the opposite.
>> 
>> This can already happen FWIW. If you currently want to change the remote
>> group address in a way that doesn't break things, you take the netdevice
>> down, then change it, then bring it back up. The leave during downing
>> can fail and will not be diagnosed. (Nor can it really be, you can't
>> veto downing.)
>
> I see.
>
>> I can add the bail-outs that you ask for, but I don't know that there is
>> a way to resolve these issues for real.
>
> The main point I made was about consistency: making the
> vxlan_config_apply() behavior as close as possible to
> vxlan_vni_update_group() as stated in the commit message.

No problem, I'll send a v2.
diff mbox series

Patch

diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index 69579425107f..7eba0ee7f602 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -3888,6 +3888,11 @@  static void vxlan_config_apply(struct net_device *dev,
 	unsigned short needed_headroom = ETH_HLEN;
 	int max_mtu = ETH_MAX_MTU;
 	u32 flags = conf->flags;
+	bool rem_changed;
+
+	rem_changed = !vxlan_addr_equal(&vxlan->default_dst.remote_ip,
+					&conf->remote_ip) ||
+		      vxlan->default_dst.remote_ifindex != conf->remote_ifindex;
 
 	if (!changelink) {
 		if (flags & VXLAN_F_GPE)
@@ -3899,6 +3904,11 @@  static void vxlan_config_apply(struct net_device *dev,
 			dev->mtu = conf->mtu;
 
 		vxlan->net = src_net;
+
+	} else if (vxlan->dev->flags & IFF_UP) {
+		if (vxlan_addr_multicast(&vxlan->default_dst.remote_ip) &&
+		    rem_changed)
+			vxlan_multicast_leave(vxlan);
 	}
 
 	dst->remote_vni = conf->vni;
@@ -3932,6 +3942,11 @@  static void vxlan_config_apply(struct net_device *dev,
 	dev->needed_headroom = needed_headroom;
 
 	memcpy(&vxlan->cfg, conf, sizeof(*conf));
+
+	if (changelink && vxlan->dev->flags & IFF_UP &&
+	    vxlan_addr_multicast(&vxlan->default_dst.remote_ip) &&
+	    rem_changed)
+		vxlan_multicast_join(vxlan);
 }
 
 static int vxlan_dev_configure(struct net *src_net, struct net_device *dev,