Message ID | 20240206123054.3052966-1-horatiu.vultur@microchip.com (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net,v2] lan966x: Fix crash when adding interface under a lag | expand |
On Tue, Feb 06, 2024 at 01:30:54PM +0100, Horatiu Vultur wrote: > There is a crash when adding one of the lan966x interfaces under a lag > interface. The issue can be reproduced like this: > ip link add name bond0 type bond miimon 100 mode balance-xor > ip link set dev eth0 master bond0 > > The reason is because when adding a interface under the lag it would go > through all the ports and try to figure out which other ports are under > that lag interface. And the issue is that lan966x can have ports that are > NULL pointer as they are not probed. So then iterating over these ports > it would just crash as they are NULL pointers. > The fix consists in actually checking for NULL pointers before accessing > something from the ports. Like we do in other places. > > Fixes: cabc9d49333d ("net: lan966x: Add lag support for lan966x") > Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> > Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org>
On Tue, 6 Feb 2024 13:30:54 +0100 Horatiu Vultur wrote: > for (lag = 0; lag < lan966x->num_phys_ports; ++lag) { > - struct net_device *bond = lan966x->ports[lag]->bond; > + struct lan966x_port *port = lan966x->ports[lag]; > int num_active_ports = 0; > + struct net_device *bond; > unsigned long bond_mask; > u8 aggr_idx[16]; > > - if (!bond || (visited & BIT(lag))) > + if (!port || !port->bond || (visited & BIT(lag))) > continue; > > + bond = port->bond; > bond_mask = lan966x_lag_get_mask(lan966x, bond); > > for_each_set_bit(p, &bond_mask, lan966x->num_phys_ports) { > struct lan966x_port *port = lan966x->ports[p]; > > + if (!port) > + continue; Why would lan966x_lag_get_mask() set a bit for a port that doesn't exist? Earlier check makes sense. This one seems too defensive.
The 02/09/2024 13:52, Jakub Kicinski wrote: Hi Jakub, > > On Tue, 6 Feb 2024 13:30:54 +0100 Horatiu Vultur wrote: > > for (lag = 0; lag < lan966x->num_phys_ports; ++lag) { > > - struct net_device *bond = lan966x->ports[lag]->bond; > > + struct lan966x_port *port = lan966x->ports[lag]; > > int num_active_ports = 0; > > + struct net_device *bond; > > unsigned long bond_mask; > > u8 aggr_idx[16]; > > > > - if (!bond || (visited & BIT(lag))) > > + if (!port || !port->bond || (visited & BIT(lag))) > > continue; > > > > + bond = port->bond; > > bond_mask = lan966x_lag_get_mask(lan966x, bond); > > > > for_each_set_bit(p, &bond_mask, lan966x->num_phys_ports) { > > struct lan966x_port *port = lan966x->ports[p]; > > > > + if (!port) > > + continue; > > Why would lan966x_lag_get_mask() set a bit for a port that doesn't > exist? Earlier check makes sense. This one seems too defensive. You are right, the lan966x_lag_get_mask() will not set a bit for a port that doesn't exist[1]. Therefore this check is not needed. [1] https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c#L354 > -- > pw-bot: cr
The 02/12/2024 09:10, Horatiu Vultur wrote: > The 02/09/2024 13:52, Jakub Kicinski wrote: > > Hi Jakub, > > > > > On Tue, 6 Feb 2024 13:30:54 +0100 Horatiu Vultur wrote: > > > for (lag = 0; lag < lan966x->num_phys_ports; ++lag) { > > > - struct net_device *bond = lan966x->ports[lag]->bond; > > > + struct lan966x_port *port = lan966x->ports[lag]; > > > int num_active_ports = 0; > > > + struct net_device *bond; > > > unsigned long bond_mask; > > > u8 aggr_idx[16]; > > > > > > - if (!bond || (visited & BIT(lag))) > > > + if (!port || !port->bond || (visited & BIT(lag))) > > > continue; > > > > > > + bond = port->bond; > > > bond_mask = lan966x_lag_get_mask(lan966x, bond); > > > > > > for_each_set_bit(p, &bond_mask, lan966x->num_phys_ports) { > > > struct lan966x_port *port = lan966x->ports[p]; > > > > > > + if (!port) > > > + continue; > > > > Why would lan966x_lag_get_mask() set a bit for a port that doesn't > > exist? Earlier check makes sense. This one seems too defensive. > > You are right, the lan966x_lag_get_mask() will not set a bit for a port > that doesn't exist[1]. Therefore this check is not needed. > > [1] https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c#L354 While trying to rebase on net, the next version of this patch, I have seen that actually this version was accepted even though it was marked as "Changes Requested". The commit sha is: 15faa1f67ab405d47789d4702f587ec7df7ef03e How do you prefer to go forward from here? - do you want to revert this and then I will send a new version? - should I send a patch that just removes this unneeded check? - any other suggestion? > > > -- > > pw-bot: cr > > -- > /Horatiu
On Mon, 12 Feb 2024 09:32:29 +0100 Horatiu Vultur wrote: > > You are right, the lan966x_lag_get_mask() will not set a bit for a port > > that doesn't exist[1]. Therefore this check is not needed. > > > > [1] https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c#L354 > > While trying to rebase on net, the next version of this patch, I have seen that > actually this version was accepted even though it was marked as "Changes > Requested". > The commit sha is: 15faa1f67ab405d47789d4702f587ec7df7ef03e > > How do you prefer to go forward from here? > - do you want to revert this and then I will send a new version? > - should I send a patch that just removes this unneeded check? > - any other suggestion? Sorry about that, I must have forgotten to reset the tree after viewing and didn't spot this between Brenos patches :S No big deal, let's leave it as is.
diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c b/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c index 41fa2523d91d3..5f2cd9a8cf8fb 100644 --- a/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c @@ -37,19 +37,24 @@ static void lan966x_lag_set_aggr_pgids(struct lan966x *lan966x) /* Now, set PGIDs for each active LAG */ for (lag = 0; lag < lan966x->num_phys_ports; ++lag) { - struct net_device *bond = lan966x->ports[lag]->bond; + struct lan966x_port *port = lan966x->ports[lag]; int num_active_ports = 0; + struct net_device *bond; unsigned long bond_mask; u8 aggr_idx[16]; - if (!bond || (visited & BIT(lag))) + if (!port || !port->bond || (visited & BIT(lag))) continue; + bond = port->bond; bond_mask = lan966x_lag_get_mask(lan966x, bond); for_each_set_bit(p, &bond_mask, lan966x->num_phys_ports) { struct lan966x_port *port = lan966x->ports[p]; + if (!port) + continue; + lan_wr(ANA_PGID_PGID_SET(bond_mask), lan966x, ANA_PGID(p)); if (port->lag_tx_active)