diff mbox series

Revert "teamd: Disregard current state when considering port enablement"

Message ID 68efe533b616cc1b76c6c14bbef354b1a52f4b08.1598939967.git.lucien.xin@gmail.com (mailing list archive)
State New
Headers show
Series Revert "teamd: Disregard current state when considering port enablement" | expand

Commit Message

Xin Long Sept. 1, 2020, 5:59 a.m. UTC
This reverts commit deadb5b715227429a1879b187f5906b39151eca9.

As Patrick noticed, with that commit, teamd_port_check_enable()
would set the team port to the new state unconditionally, which
triggers another change message from kernel to userspace, then
teamd_port_check_enable() is called again to set the team port
to the new state.

This would go around and around to update the team port state,
and even cause teamd to consume 100% cpu.

As the issue caused by that commit is serious, it has to be
reverted. As for the issued fixed by that commit, I would
propose a new fix later.
---
 teamd/teamd_per_port.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Comments

Jiri Pirko Sept. 2, 2020, 8:15 a.m. UTC | #1
Tue, Sep 01, 2020 at 07:59:27AM CEST, lucien.xin@gmail.com wrote:
>This reverts commit deadb5b715227429a1879b187f5906b39151eca9.
>
>As Patrick noticed, with that commit, teamd_port_check_enable()
>would set the team port to the new state unconditionally, which
>triggers another change message from kernel to userspace, then
>teamd_port_check_enable() is called again to set the team port
>to the new state.
>
>This would go around and around to update the team port state,
>and even cause teamd to consume 100% cpu.
>
>As the issue caused by that commit is serious, it has to be
>reverted. As for the issued fixed by that commit, I would
>propose a new fix later.

Done. Thanks!

>---
> teamd/teamd_per_port.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
>diff --git a/teamd/teamd_per_port.c b/teamd/teamd_per_port.c
>index 166da57..d429753 100644
>--- a/teamd/teamd_per_port.c
>+++ b/teamd/teamd_per_port.c
>@@ -442,14 +442,18 @@ int teamd_port_check_enable(struct teamd_context *ctx,
> 			    bool should_enable, bool should_disable)
> {
> 	bool new_enabled_state;
>+	bool curr_enabled_state;
> 	int err;
> 
> 	if (!teamd_port_present(ctx, tdport))
> 		return 0;
>+	err = teamd_port_enabled(ctx, tdport, &curr_enabled_state);
>+	if (err)
>+		return err;
> 
>-	if (should_enable)
>+	if (!curr_enabled_state && should_enable)
> 		new_enabled_state = true;
>-	else if (should_disable)
>+	else if (curr_enabled_state && should_disable)
> 		new_enabled_state = false;
> 	else
> 		return 0;
>-- 
>2.1.0
>
Jiri Pirko Sept. 14, 2020, 8:06 a.m. UTC | #2
Tue, Sep 01, 2020 at 07:59:27AM CEST, lucien.xin@gmail.com wrote:
>This reverts commit deadb5b715227429a1879b187f5906b39151eca9.
>
>As Patrick noticed, with that commit, teamd_port_check_enable()
>would set the team port to the new state unconditionally, which
>triggers another change message from kernel to userspace, then
>teamd_port_check_enable() is called again to set the team port
>to the new state.
>
>This would go around and around to update the team port state,
>and even cause teamd to consume 100% cpu.
>
>As the issue caused by that commit is serious, it has to be
>reverted. As for the issued fixed by that commit, I would
>propose a new fix later.

Hi Xin. Any progress with the fix? I would like to do a libteam release
soon but I'm happy to wait couple of days if the fix is in pipes.

Thanks!


>---
> teamd/teamd_per_port.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
>diff --git a/teamd/teamd_per_port.c b/teamd/teamd_per_port.c
>index 166da57..d429753 100644
>--- a/teamd/teamd_per_port.c
>+++ b/teamd/teamd_per_port.c
>@@ -442,14 +442,18 @@ int teamd_port_check_enable(struct teamd_context *ctx,
> 			    bool should_enable, bool should_disable)
> {
> 	bool new_enabled_state;
>+	bool curr_enabled_state;
> 	int err;
> 
> 	if (!teamd_port_present(ctx, tdport))
> 		return 0;
>+	err = teamd_port_enabled(ctx, tdport, &curr_enabled_state);
>+	if (err)
>+		return err;
> 
>-	if (should_enable)
>+	if (!curr_enabled_state && should_enable)
> 		new_enabled_state = true;
>-	else if (should_disable)
>+	else if (curr_enabled_state && should_disable)
> 		new_enabled_state = false;
> 	else
> 		return 0;
>-- 
>2.1.0
>
Xin Long Sept. 15, 2020, 5:39 a.m. UTC | #3
On Mon, Sep 14, 2020 at 4:06 PM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Tue, Sep 01, 2020 at 07:59:27AM CEST, lucien.xin@gmail.com wrote:
> >This reverts commit deadb5b715227429a1879b187f5906b39151eca9.
> >
> >As Patrick noticed, with that commit, teamd_port_check_enable()
> >would set the team port to the new state unconditionally, which
> >triggers another change message from kernel to userspace, then
> >teamd_port_check_enable() is called again to set the team port
> >to the new state.
> >
> >This would go around and around to update the team port state,
> >and even cause teamd to consume 100% cpu.
> >
> >As the issue caused by that commit is serious, it has to be
> >reverted. As for the issued fixed by that commit, I would
> >propose a new fix later.
>
> Hi Xin. Any progress with the fix? I would like to do a libteam release
> soon but I'm happy to wait couple of days if the fix is in pipes.
Hi, Jiri,

I could never reproduce this issue by:

  # ./mirror_gre_lag_lacp.sh veth{5..12}

especially after the kernel patch:

commit 1c0522b4a2e143fa6e55e4bd2308415c81184ec7
Author: Petr Machata <petrm at mellanox.com>
Date:   Fri May 29 14:16:53 2020 +0300

    selftests: forwarding: mirror_lib: Use mausezahn

I will need to add some code to try to reproduce it.
Give me another 2 days, I will let you know soon.

Thanks.


>
> Thanks!
>
>
> >---
> > teamd/teamd_per_port.c | 8 ++++++--
> > 1 file changed, 6 insertions(+), 2 deletions(-)
> >
> >diff --git a/teamd/teamd_per_port.c b/teamd/teamd_per_port.c
> >index 166da57..d429753 100644
> >--- a/teamd/teamd_per_port.c
> >+++ b/teamd/teamd_per_port.c
> >@@ -442,14 +442,18 @@ int teamd_port_check_enable(struct teamd_context *ctx,
> >                           bool should_enable, bool should_disable)
> > {
> >       bool new_enabled_state;
> >+      bool curr_enabled_state;
> >       int err;
> >
> >       if (!teamd_port_present(ctx, tdport))
> >               return 0;
> >+      err = teamd_port_enabled(ctx, tdport, &curr_enabled_state);
> >+      if (err)
> >+              return err;
> >
> >-      if (should_enable)
> >+      if (!curr_enabled_state && should_enable)
> >               new_enabled_state = true;
> >-      else if (should_disable)
> >+      else if (curr_enabled_state && should_disable)
> >               new_enabled_state = false;
> >       else
> >               return 0;
> >--
> >2.1.0
> >
Xin Long Sept. 16, 2020, 10:48 a.m. UTC | #4
On Tue, Sep 15, 2020 at 1:39 PM Xin Long <lucien.xin@gmail.com> wrote:
>
> On Mon, Sep 14, 2020 at 4:06 PM Jiri Pirko <jiri@resnulli.us> wrote:
> >
> > Tue, Sep 01, 2020 at 07:59:27AM CEST, lucien.xin@gmail.com wrote:
> > >This reverts commit deadb5b715227429a1879b187f5906b39151eca9.
> > >
> > >As Patrick noticed, with that commit, teamd_port_check_enable()
> > >would set the team port to the new state unconditionally, which
> > >triggers another change message from kernel to userspace, then
> > >teamd_port_check_enable() is called again to set the team port
> > >to the new state.
> > >
> > >This would go around and around to update the team port state,
> > >and even cause teamd to consume 100% cpu.
> > >
> > >As the issue caused by that commit is serious, it has to be
> > >reverted. As for the issued fixed by that commit, I would
> > >propose a new fix later.
> >
> > Hi Xin. Any progress with the fix? I would like to do a libteam release
> > soon but I'm happy to wait couple of days if the fix is in pipes.
> Hi, Jiri,
>
> I could never reproduce this issue by:
>
>   # ./mirror_gre_lag_lacp.sh veth{5..12}
>
> especially after the kernel patch:
>
> commit 1c0522b4a2e143fa6e55e4bd2308415c81184ec7
> Author: Petr Machata <petrm at mellanox.com>
> Date:   Fri May 29 14:16:53 2020 +0300
>
>     selftests: forwarding: mirror_lib: Use mausezahn
>
> I will need to add some code to try to reproduce it.
> Give me another 2 days, I will let you know soon.
>
Hi, Jiri

Just noticed that the original issue is exactly the same one fixed by:

commit 90e5279ce0241838d5687739b3bc795235b7d53b
Author: Xin Long <lucien.xin@gmail.com>
Date:   Mon Apr 15 16:56:35 2019 +0800

    teamd: use enabled option_changed to sync enabled to link_up for lb runner

pls see the changelog. You can go with the release now.

Thanks.
diff mbox series

Patch

diff --git a/teamd/teamd_per_port.c b/teamd/teamd_per_port.c
index 166da57..d429753 100644
--- a/teamd/teamd_per_port.c
+++ b/teamd/teamd_per_port.c
@@ -442,14 +442,18 @@  int teamd_port_check_enable(struct teamd_context *ctx,
 			    bool should_enable, bool should_disable)
 {
 	bool new_enabled_state;
+	bool curr_enabled_state;
 	int err;
 
 	if (!teamd_port_present(ctx, tdport))
 		return 0;
+	err = teamd_port_enabled(ctx, tdport, &curr_enabled_state);
+	if (err)
+		return err;
 
-	if (should_enable)
+	if (!curr_enabled_state && should_enable)
 		new_enabled_state = true;
-	else if (should_disable)
+	else if (curr_enabled_state && should_disable)
 		new_enabled_state = false;
 	else
 		return 0;