diff mbox series

[net,v2] net: fix stack overflow when LRO is disabled for virtual interfaces

Message ID 20230517143010.3596250-1-ap420073@gmail.com (mailing list archive)
State Accepted
Commit ae9b15fbe63447bc1d3bba3769f409d17ca6fdf6
Delegated to: Netdev Maintainers

Checks

Context                         Check    Description
netdev/series_format            success  Single patches do not need cover letters
netdev/tree_selection           success  Clearly marked for net
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 19 this patch: 19
netdev/cc_maintainers           success  CCed 9 of 9 maintainers
netdev/build_clang              success  Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 19 this patch: 19
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 53 lines checked
netdev/kdoc                     success  Errors and warnings before: 2 this patch: 2
netdev/source_inline            success  Was 0 now: 0

Commit Message

Taehee Yoo May 17, 2023, 2:30 p.m. UTC
When a virtual interface's features are updated, it propagates the
updated features to its own lower interfaces.
This propagation logic should work iteratively, not recursively.
But it unexpectedly works recursively because of the netdev notification.
This problem occurs only for the team and bonding interface types, when
LRO is disabled.

       team0
         |
  +------+------+-----+-----+
  |      |      |     |     |
team1  team2  team3  ...  team200

If team0's LRO feature is updated, it generates the NETDEV_FEAT_CHANGE
event for its own lower interfaces (team1 ~ team200).
This is done by netdev_sync_lower_features().
So, the NETDEV_FEAT_CHANGE notification logic of each lower interface
works iteratively.
But the generated NETDEV_FEAT_CHANGE event is also sent to the upper
interface.
The upper interface (team0) then generates the NETDEV_FEAT_CHANGE event
for its own lower interfaces again.
The lower and upper interfaces receive this event and generate it
again and again.
So, the stack overflows.

But it is not an infinite loop, because netdev_sync_lower_features()
updates the features before generating the NETDEV_FEAT_CHANGE event.
Already synchronized lower interfaces skip the notification logic.
So, the problem is just that the iteration unexpectedly turns into
recursion because of the notification mechanism.

Reproducer:

ip link add team0 type team
ethtool -K team0 lro on
for i in {1..200}
do
        ip link add team$i master team0 type team
        ethtool -K team$i lro on
done

ethtool -K team0 lro off

To fix it, a notifier_ctx member is added to struct bonding and struct team.

Reported-by: syzbot+60748c96cf5c6df8e581@syzkaller.appspotmail.com
Fixes: fd867d51f889 ("net/core: generic support for disabling netdev features down stack")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---

v2:
 - Add new member to struct bonding/team instead of the net_device.

 drivers/net/bonding/bond_main.c | 8 +++++++-
 drivers/net/team/team.c         | 7 ++++++-
 include/linux/if_team.h         | 1 +
 include/net/bonding.h           | 1 +
 4 files changed, 15 insertions(+), 2 deletions(-)

Comments

Eric Dumazet May 17, 2023, 2:59 p.m. UTC | #1
On Wed, May 17, 2023 at 4:30 PM Taehee Yoo <ap420073@gmail.com> wrote:
>
> When the virtual interface's feature is updated, it synchronizes the
> updated feature for its own lower interface.
> This propagation logic should be worked as the iteration, not recursively.
> But it works recursively due to the netdev notification unexpectedly.
> This problem occurs when it disables LRO only for the team and bonding
> interface type.
>
>        team0
>          |
>   +------+------+-----+-----+
>   |      |      |     |     |
> team1  team2  team3  ...  team200
>
> If team0's LRO feature is updated, it generates the NETDEV_FEAT_CHANGE
> event to its own lower interfaces(team1 ~ team200).
> It is worked by netdev_sync_lower_features().
> So, the NETDEV_FEAT_CHANGE notification logic of each lower interface
> work iteratively.
> But generated NETDEV_FEAT_CHANGE event is also sent to the upper
> interface too.
> upper interface(team0) generates the NETDEV_FEAT_CHANGE event for its own
> lower interfaces again.
> lower and upper interfaces receive this event and generate this
> event again and again.
> So, the stack overflow occurs.
>
> But it is not the infinite loop issue.
> Because the netdev_sync_lower_features() updates features before
> generating the NETDEV_FEAT_CHANGE event.
> Already synchronized lower interfaces skip notification logic.
> So, it is just the problem that iteration logic is changed to the
> recursive unexpectedly due to the notification mechanism.
>
> Reproducer:
>
> ip link add team0 type team
> ethtool -K team0 lro on
> for i in {1..200}
> do
>         ip link add team$i master team0 type team
>         ethtool -K team$i lro on
> done
>
> ethtool -K team0 lro off
>
> In order to fix it, the notifier_ctx member of bonding/team is introduced.
>
> Reported-by: syzbot+60748c96cf5c6df8e581@syzkaller.appspotmail.com
> Fixes: fd867d51f889 ("net/core: generic support for disabling netdev features down stack")
> Signed-off-by: Taehee Yoo <ap420073@gmail.com>

Reviewed-by: Eric Dumazet <edumazet@google.com>

Thanks.
Nikolay Aleksandrov May 17, 2023, 3:02 p.m. UTC | #2
On 17/05/2023 17:30, Taehee Yoo wrote:
> When the virtual interface's feature is updated, it synchronizes the
> updated feature for its own lower interface.
> This propagation logic should be worked as the iteration, not recursively.
> But it works recursively due to the netdev notification unexpectedly.
> This problem occurs when it disables LRO only for the team and bonding
> interface type.
> 
>        team0
>          |
>   +------+------+-----+-----+
>   |      |      |     |     |
> team1  team2  team3  ...  team200
> 
> If team0's LRO feature is updated, it generates the NETDEV_FEAT_CHANGE
> event to its own lower interfaces(team1 ~ team200).
> It is worked by netdev_sync_lower_features().
> So, the NETDEV_FEAT_CHANGE notification logic of each lower interface
> work iteratively.
> But generated NETDEV_FEAT_CHANGE event is also sent to the upper
> interface too.
> upper interface(team0) generates the NETDEV_FEAT_CHANGE event for its own
> lower interfaces again.
> lower and upper interfaces receive this event and generate this
> event again and again.
> So, the stack overflow occurs.
> 
> But it is not the infinite loop issue.
> Because the netdev_sync_lower_features() updates features before
> generating the NETDEV_FEAT_CHANGE event.
> Already synchronized lower interfaces skip notification logic.
> So, it is just the problem that iteration logic is changed to the
> recursive unexpectedly due to the notification mechanism.
> 
> Reproducer:
> 
> ip link add team0 type team
> ethtool -K team0 lro on
> for i in {1..200}
> do
>         ip link add team$i master team0 type team
>         ethtool -K team$i lro on
> done
> 
> ethtool -K team0 lro off
> 
> In order to fix it, the notifier_ctx member of bonding/team is introduced.
> 
> Reported-by: syzbot+60748c96cf5c6df8e581@syzkaller.appspotmail.com
> Fixes: fd867d51f889 ("net/core: generic support for disabling netdev features down stack")
> Signed-off-by: Taehee Yoo <ap420073@gmail.com>
> ---
> 
> v2:
>  - Add new member to struct bonding/team instead of the net_device.
> 
>  drivers/net/bonding/bond_main.c | 8 +++++++-
>  drivers/net/team/team.c         | 7 ++++++-
>  include/linux/if_team.h         | 1 +
>  include/net/bonding.h           | 1 +
>  4 files changed, 15 insertions(+), 2 deletions(-)
> 
LGTM
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Jakub Kicinski May 17, 2023, 4:15 p.m. UTC | #3
On Wed, 17 May 2023 14:30:10 +0000 Taehee Yoo wrote:
> When the virtual interface's feature is updated, it synchronizes the
> updated feature for its own lower interface.
> This propagation logic should be worked as the iteration, not recursively.
> But it works recursively due to the netdev notification unexpectedly.
> This problem occurs when it disables LRO only for the team and bonding
> interface type.
> 
>        team0
>          |
>   +------+------+-----+-----+
>   |      |      |     |     |
> team1  team2  team3  ...  team200
> 
> If team0's LRO feature is updated, it generates the NETDEV_FEAT_CHANGE
> event to its own lower interfaces(team1 ~ team200).
> It is worked by netdev_sync_lower_features().
> So, the NETDEV_FEAT_CHANGE notification logic of each lower interface
> work iteratively.
> But generated NETDEV_FEAT_CHANGE event is also sent to the upper
> interface too.
> upper interface(team0) generates the NETDEV_FEAT_CHANGE event for its own
> lower interfaces again.
> lower and upper interfaces receive this event and generate this
> event again and again.
> So, the stack overflow occurs.
> 
> But it is not the infinite loop issue.
> Because the netdev_sync_lower_features() updates features before
> generating the NETDEV_FEAT_CHANGE event.
> Already synchronized lower interfaces skip notification logic.

Why doesn't the (already synchronized) upper not skip the update?

> So, it is just the problem that iteration logic is changed to the
> recursive unexpectedly due to the notification mechanism.
> 
> Reproducer:
> 
> ip link add team0 type team
> ethtool -K team0 lro on
> for i in {1..200}
> do
>         ip link add team$i master team0 type team
>         ethtool -K team$i lro on
> done
> 
> ethtool -K team0 lro off
> 
> In order to fix it, the notifier_ctx member of bonding/team is introduced.
Taehee Yoo May 17, 2023, 5:28 p.m. UTC | #4
On 5/18/23 01:15, Jakub Kicinski wrote:

Hi Jakub,
Thank you so much for the review!

 > On Wed, 17 May 2023 14:30:10 +0000 Taehee Yoo wrote:
 >> When the virtual interface's feature is updated, it synchronizes the
 >> updated feature for its own lower interface.
 >> This propagation logic should be worked as the iteration, not recursively.
 >> But it works recursively due to the netdev notification unexpectedly.
 >> This problem occurs when it disables LRO only for the team and bonding
 >> interface type.
 >>
 >>         team0
 >>           |
 >>    +------+------+-----+-----+
 >>    |      |      |     |     |
 >> team1  team2  team3  ...  team200
 >>
 >> If team0's LRO feature is updated, it generates the NETDEV_FEAT_CHANGE
 >> event to its own lower interfaces(team1 ~ team200).
 >> It is worked by netdev_sync_lower_features().
 >> So, the NETDEV_FEAT_CHANGE notification logic of each lower interface
 >> work iteratively.
 >> But generated NETDEV_FEAT_CHANGE event is also sent to the upper
 >> interface too.
 >> upper interface(team0) generates the NETDEV_FEAT_CHANGE event for its own
 >> lower interfaces again.
 >> lower and upper interfaces receive this event and generate this
 >> event again and again.
 >> So, the stack overflow occurs.
 >>
 >> But it is not the infinite loop issue.
 >> Because the netdev_sync_lower_features() updates features before
 >> generating the NETDEV_FEAT_CHANGE event.
 >> Already synchronized lower interfaces skip notification logic.
 >
 > Why doesn't the (already synchronized) upper not skip the update?

The skipping logic lives in netdev_sync_lower_features().
Its purpose is to synchronize the lower interfaces, not the upper
interface.
Actually, there is no upper-only synchronization logic.

Both bonding and team interfaces rely on the notification mechanism to
run their own logic, such as synchronization.
The notification is a broadcast mechanism.
So, both the lower and upper interfaces receive this event, and each
runs its own notification handling.
But the notification mechanism currently has no options such as
filtering, and these interfaces receive the event with already updated
feature flags.
So, the upper interface can't distinguish whether a received event is
the first event or a duplicate.

 >
 >> So, it is just the problem that iteration logic is changed to the
 >> recursive unexpectedly due to the notification mechanism.
 >>
 >> Reproducer:
 >>
 >> ip link add team0 type team
 >> ethtool -K team0 lro on
 >> for i in {1..200}
 >> do
 >>          ip link add team$i master team0 type team
 >>          ethtool -K team$i lro on
 >> done
 >>
 >> ethtool -K team0 lro off
 >>
 >> In order to fix it, the notifier_ctx member of bonding/team is introduced.
 >

Thank you so much!
Taehee Yoo
Jakub Kicinski May 17, 2023, 6:45 p.m. UTC | #5
On Thu, 18 May 2023 02:28:29 +0900 Taehee Yoo wrote:
>  > Why doesn't the (already synchronized) upper not skip the update?  
> 
> The skipping logic of this is existing in the netdev_sync_lower_features().
> The purpose of this is to synchronize the lower interfaces, not the 
> upper interface.
> Actually, there is no upper-only synchronization logic.
>
> Both bonding and team interfaces rely on notification mechanisms to work 
> their own logic such as synchronization.
> The notification is a broadcasting mechanism.
> So, both lower and upper receive this event, and it works its own 
> notification handling.

This is all true.

> But the notification mechanism currently doesn't have options such as 
> filtering and these interfaces receive this event with updated feature 
> flags.

We don't have to filter notifications.

> So, the upper interface can't distinguish whether the received event is 
> the first event or duplicated event.

What I was thinking was basically why does __netdev_update_features()
not return early if it made no changes? Looking thru the history this
behavior has been created by commit e7868a85e1b26bcb2e. Can we revert
that and fix the problem of syncing features on new ports differently?
Taehee Yoo May 19, 2023, 6:25 a.m. UTC | #6
On 5/18/23 03:45, Jakub Kicinski wrote:
 > On Thu, 18 May 2023 02:28:29 +0900 Taehee Yoo wrote:
 >>   > Why doesn't the (already synchronized) upper not skip the update?
 >>
 >> The skipping logic of this is existing in the netdev_sync_lower_features().
 >> The purpose of this is to synchronize the lower interfaces, not the
 >> upper interface.
 >> Actually, there is no upper-only synchronization logic.
 >>
 >> Both bonding and team interfaces rely on notification mechanisms to work
 >> their own logic such as synchronization.
 >> The notification is a broadcasting mechanism.
 >> So, both lower and upper receive this event, and it works its own
 >> notification handling.
 >
 > This is all true.
 >
 >> But the notification mechanism currently doesn't have options such as
 >> filtering and these interfaces receive this event with updated feature
 >> flags.
 >
 > We don't have to filter notifications.
 >
 >> So, the upper interface can't distinguish whether the received event is
 >> the first event or duplicated event.
 >
 > What I was thinking was basically why does __netdev_update_features()
 > not return early if it made no changes? Looking thru the history this
 > behavior has been created by commit e7868a85e1b26bcb2e. Can we revert
 > that and fix the problem of syncing features on new ports differently?

I think this is the best approach, so I tried it with existing variables
such as dev->features and dev->wanted_features, but I couldn't find a
way to make it work, because the upper interface's feature flags are
updated at the very end.
So, __netdev_update_features() is always called with the upper
interface's old feature flags.
__netdev_update_features()
     netdev_sync_lower_features()
        __netdev_update_features()
            netdev_sync_lower_features()
...
            dev->features = features;
        dev->features = features;
     dev->features = features;
dev->features = features;

For __netdev_update_features() to return early on a duplicated call,
it has to update dev->features earlier than the
netdev_sync_lower_features() call.
The current code doesn't update dev->features early, so it can't
detect duplicated calls by checking dev->features.

I tested this approach with a revert of e7868a85e1b26bcb2e and the below 
change.

diff --git a/net/core/dev.c b/net/core/dev.c
index 6b12d8a9d463..f051c293ffaa 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -9758,6 +9758,9 @@ int __netdev_update_features(struct net_device *dev)
                 return -1;
         }

+       if (netif_is_bond_master(dev) || netif_is_team_master(dev))
+               dev->features = features;
+
         /* some features must be disabled on lower devices when disabled
          * on an upper device (think: bonding master or bridge)
          */

It fixes the stack overflow problem, but I'm not sure whether updating
dev->features before netdev_sync_lower_features() is safe or not.
Jakub Kicinski May 19, 2023, 9:31 p.m. UTC | #7
On Fri, 19 May 2023 15:25:12 +0900 Taehee Yoo wrote:
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 6b12d8a9d463..f051c293ffaa 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -9758,6 +9758,9 @@ int __netdev_update_features(struct net_device *dev)
>                  return -1;
>          }
> 
> +       if (netif_is_bond_master(dev) || netif_is_team_master(dev))
> +               dev->features = features;
> +
>          /* some features must be disabled on lower devices when disabled
>           * on an upper device (think: bonding master or bridge)
>           */
> 
> It fixes the stack overflow problem, but I'm not sure whether updating 
> it before netdev_sync_lower_features() is safe or not.

Indeed, I don't think we can do this, udp_tunnel_drop_rx_info()
will get confused for example. Let me just apply the patch as is..
patchwork-bot+netdevbpf@kernel.org May 20, 2023, 5:50 a.m. UTC | #8
Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 17 May 2023 14:30:10 +0000 you wrote:
> When the virtual interface's feature is updated, it synchronizes the
> updated feature for its own lower interface.
> This propagation logic should be worked as the iteration, not recursively.
> But it works recursively due to the netdev notification unexpectedly.
> This problem occurs when it disables LRO only for the team and bonding
> interface type.
> 
> [...]

Here is the summary with links:
  - [net,v2] net: fix stack overflow when LRO is disabled for virtual interfaces
    https://git.kernel.org/netdev/net/c/ae9b15fbe634

You are awesome, thank you!

Patch

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 3fed888629f7..edbaa1444f8e 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3947,7 +3947,11 @@  static int bond_slave_netdev_event(unsigned long event,
 		unblock_netpoll_tx();
 		break;
 	case NETDEV_FEAT_CHANGE:
-		bond_compute_features(bond);
+		if (!bond->notifier_ctx) {
+			bond->notifier_ctx = true;
+			bond_compute_features(bond);
+			bond->notifier_ctx = false;
+		}
 		break;
 	case NETDEV_RESEND_IGMP:
 		/* Propagate to master device */
@@ -6342,6 +6346,8 @@  static int bond_init(struct net_device *bond_dev)
 	if (!bond->wq)
 		return -ENOMEM;
 
+	bond->notifier_ctx = false;
+
 	spin_lock_init(&bond->stats_lock);
 	netdev_lockdep_set_classes(bond_dev);
 
diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index d10606f257c4..555b0b1e9a78 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -1629,6 +1629,7 @@  static int team_init(struct net_device *dev)
 
 	team->dev = dev;
 	team_set_no_mode(team);
+	team->notifier_ctx = false;
 
 	team->pcpu_stats = netdev_alloc_pcpu_stats(struct team_pcpu_stats);
 	if (!team->pcpu_stats)
@@ -3022,7 +3023,11 @@  static int team_device_event(struct notifier_block *unused,
 		team_del_slave(port->team->dev, dev);
 		break;
 	case NETDEV_FEAT_CHANGE:
-		team_compute_features(port->team);
+		if (!port->team->notifier_ctx) {
+			port->team->notifier_ctx = true;
+			team_compute_features(port->team);
+			port->team->notifier_ctx = false;
+		}
 		break;
 	case NETDEV_PRECHANGEMTU:
 		/* Forbid to change mtu of underlaying device */
diff --git a/include/linux/if_team.h b/include/linux/if_team.h
index fc985e5c739d..8de6b6e67829 100644
--- a/include/linux/if_team.h
+++ b/include/linux/if_team.h
@@ -208,6 +208,7 @@  struct team {
 	bool queue_override_enabled;
 	struct list_head *qom_lists; /* array of queue override mapping lists */
 	bool port_mtu_change_allowed;
+	bool notifier_ctx;
 	struct {
 		unsigned int count;
 		unsigned int interval; /* in ms */
diff --git a/include/net/bonding.h b/include/net/bonding.h
index 0efef2a952b7..59955ac33157 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -221,6 +221,7 @@  struct bonding {
 	struct   bond_up_slave __rcu *usable_slaves;
 	struct   bond_up_slave __rcu *all_slaves;
 	bool     force_primary;
+	bool     notifier_ctx;
 	s32      slave_cnt; /* never change this value outside the attach/detach wrappers */
 	int     (*recv_probe)(const struct sk_buff *, struct bonding *,
 			      struct slave *);