diff mbox series

teamd: lacp: update port state according to partner's sync bit

Message ID 20190130100213.28960-1-liuhangbin@gmail.com (mailing list archive)
State New
Headers show
Series teamd: lacp: update port state according to partner's sync bit | expand

Commit Message

Hangbin Liu Jan. 30, 2019, 10:02 a.m. UTC
1) According to 6.4.12 of IEEE 802.1AX-2008, Figure 6-18, if the partner's sync
bit is not set, the lacp port state should be set to EXPIRED.

Function lacpdu_recv() makes the state of the lacp port to PORT_STATE_CURRENT
when the teamd receives any valid lacp packet. Fix it by checking the parter's
sync bit.

2) According to 6.4.15 of IEEE 802.1AX-2008, Figure 6-22, the state that the
port is selected moves MUX state from DETACHED to ATTACHED.

But ATTACHED sate does not mean that the port can send and receive user frames.
COLLECTING_DISTRIBUTION state is the sate that the port can send and receive
user frames. To move MUX state from ATTACHED to COLLECTING_DISTRIBUTION, the
partner state should be sync as well as the port selected.

In function lacp_port_actor_update(), only INFO_STATE_SYNCHRONIZATION should
be set to the actor.state when the port is selected. INFO_STATE_COLLECTING and
INFO_STATE_DISTRIBUTING should be set to false with ATTACHED mode and set to
true when INFO_STATE_SYNCHRONIZATION of partner.state is set.

In function lacp_port_should_be_{enabled, disabled}(), we also need to check
the INFO_STATE_SYNCHRONIZATION bit of partner.state.

Reported-by: Cisco Systems
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
 teamd/teamd_runner_lacp.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

Comments

Hangbin Liu Feb. 28, 2019, 9:42 a.m. UTC | #1
On Wed, 30 Jan 2019 at 18:02, Hangbin Liu <liuhangbin@gmail.com> wrote:
>
> 1) According to 6.4.12 of IEEE 802.1AX-2008, Figure 6-18, if the partner's sync
> bit is not set, the lacp port state should be set to EXPIRED.
>
> Function lacpdu_recv() makes the state of the lacp port to PORT_STATE_CURRENT
> when the teamd receives any valid lacp packet. Fix it by checking the parter's
> sync bit.
>
> 2) According to 6.4.15 of IEEE 802.1AX-2008, Figure 6-22, the state that the
> port is selected moves MUX state from DETACHED to ATTACHED.
>
> But ATTACHED sate does not mean that the port can send and receive user frames.
> COLLECTING_DISTRIBUTION state is the sate that the port can send and receive
> user frames. To move MUX state from ATTACHED to COLLECTING_DISTRIBUTION, the
> partner state should be sync as well as the port selected.
>
> In function lacp_port_actor_update(), only INFO_STATE_SYNCHRONIZATION should
> be set to the actor.state when the port is selected. INFO_STATE_COLLECTING and
> INFO_STATE_DISTRIBUTING should be set to false with ATTACHED mode and set to
> true when INFO_STATE_SYNCHRONIZATION of partner.state is set.
>
> In function lacp_port_should_be_{enabled, disabled}(), we also need to check
> the INFO_STATE_SYNCHRONIZATION bit of partner.state.
>
> Reported-by: Cisco Systems
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
> ---
>

Hi Jiri,

Please hold on reviewing this patch. I may post a update later.

Thanks
Hangbin
Hangbin Liu Feb. 28, 2019, 1:06 p.m. UTC | #2
Hi Jiri,

There are two parts I want to update in this patch. Would you please help
have a look at first?

On Wed, 30 Jan 2019 at 18:02, Hangbin Liu <liuhangbin@gmail.com> wrote:
>
> 1) According to 6.4.12 of IEEE 802.1AX-2008, Figure 6-18, if the partner's sync
> bit is not set, the lacp port state should be set to EXPIRED.
>
> Function lacpdu_recv() makes the state of the lacp port to PORT_STATE_CURRENT
> when the teamd receives any valid lacp packet. Fix it by checking the parter's
> sync bit.
>
> 2) According to 6.4.15 of IEEE 802.1AX-2008, Figure 6-22, the state that the
> port is selected moves MUX state from DETACHED to ATTACHED.
>
> But ATTACHED sate does not mean that the port can send and receive user frames.
> COLLECTING_DISTRIBUTION state is the sate that the port can send and receive
> user frames. To move MUX state from ATTACHED to COLLECTING_DISTRIBUTION, the
> partner state should be sync as well as the port selected.
>
> In function lacp_port_actor_update(), only INFO_STATE_SYNCHRONIZATION should
> be set to the actor.state when the port is selected. INFO_STATE_COLLECTING and
> INFO_STATE_DISTRIBUTING should be set to false with ATTACHED mode and set to
> true when INFO_STATE_SYNCHRONIZATION of partner.state is set.
>
> In function lacp_port_should_be_{enabled, disabled}(), we also need to check
> the INFO_STATE_SYNCHRONIZATION bit of partner.state.
>
> Reported-by: Cisco Systems
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
> ---
>  teamd/teamd_runner_lacp.c | 21 +++++++++++++++------
>  1 file changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/teamd/teamd_runner_lacp.c b/teamd/teamd_runner_lacp.c
> index 7b8f0a7..5e763ad 100644
> --- a/teamd/teamd_runner_lacp.c
> +++ b/teamd/teamd_runner_lacp.c
> @@ -333,7 +333,8 @@ static int lacp_port_should_be_enabled(struct lacp_port *lacp_port)
>         struct lacp *lacp = lacp_port->lacp;
>
>         if (lacp_port_selected(lacp_port) &&
> -           lacp_port->agg_lead == lacp->selected_agg_lead)
> +           lacp_port->agg_lead == lacp->selected_agg_lead &&
> +           lacp_port->partner.state & INFO_STATE_SYNCHRONIZATION)

First is here. I want to enable the port when partner enters COLLECTING state.

In this patch, We will set state to "SYNC & COLLECTING & DISTRIBUTIN"
and enable port
when partner enter SYNCHRONIZATION state. But as talked in [1],

Source Peer
S:0 C:0 D:0 S:0 C:0 D:0
S:1 C:0 D:0 ->
<- S:1 C:0 D:0 ?> At this time, peer LACP members have been chosen
S:1 C:1 D:0 -> ?> Source port indicates that it is ready to accept traffic
<- S:1 C:1 D:0 ?> Destination port indicates that it is ready to accept traffic
S:1 C:1 D:1 -> ?> Source port indicates that it is ready to send traffic
<- S:1 C:1 D:1 ?> Destination port indicates that it is ready to send traffic

After we enabling port and start transmitting data, partner is in SYNC
state but not in
COLLECTING. There will be a moment that partner not ready to receive
data.  The data
will be dropped.

So enable port after partner ready to receive trafice(in COLLECTING
state) seems better.

>                 return true;
>         return false;
>  }
> @@ -343,7 +344,8 @@ static int lacp_port_should_be_disabled(struct lacp_port *lacp_port)
>         struct lacp *lacp = lacp_port->lacp;
>
>         if (!lacp_port_selected(lacp_port) ||
> -           lacp_port->agg_lead != lacp->selected_agg_lead)
> +           lacp_port->agg_lead != lacp->selected_agg_lead ||
> +           !(lacp_port->partner.state & INFO_STATE_SYNCHRONIZATION))
>                 return true;
>         return false;
>  }
> @@ -914,9 +916,13 @@ static void lacp_port_actor_update(struct lacp_port *lacp_port)
>         if (lacp_port->lacp->cfg.fast_rate)
>                 state |= INFO_STATE_LACP_TIMEOUT;
>         if (lacp_port_selected(lacp_port) &&
> -           lacp_port_agg_selected(lacp_port))
> -               state |= INFO_STATE_SYNCHRONIZATION |
> -                        INFO_STATE_COLLECTING | INFO_STATE_DISTRIBUTING;
> +           lacp_port_agg_selected(lacp_port)) {
> +               state |= INFO_STATE_SYNCHRONIZATION;
> +               state &= ~(INFO_STATE_COLLECTING | INFO_STATE_DISTRIBUTING);
> +               if (lacp_port->partner.state & INFO_STATE_SYNCHRONIZATION)

Same here,  should we set state to COLLECTING & DISTRIBUTING after
partner enters
INFO_STATE_COLLECTING state?

> +                       state |= INFO_STATE_COLLECTING |
> +                                INFO_STATE_DISTRIBUTING;
> +       }
>         if (lacp_port->state == PORT_STATE_EXPIRED)
>                 state |= INFO_STATE_EXPIRED;
>         if (lacp_port->state == PORT_STATE_DEFAULTED)
> @@ -1095,7 +1101,10 @@ static int lacpdu_recv(struct lacp_port *lacp_port)
>                         return err;
>         }
>
> -       err = lacp_port_set_state(lacp_port, PORT_STATE_CURRENT);
> +       if (lacp_port->partner.state & INFO_STATE_SYNCHRONIZATION)
> +               err = lacp_port_set_state(lacp_port, PORT_STATE_CURRENT);
> +       else
> +               err = lacp_port_set_state(lacp_port, PORT_STATE_EXPIRED);
>         if (err)
>                 return err;
>

The second question is about commit a8e7d65f64b ("teamd: lacp: "expired" port
 is not selectable, only "current""). Is there a line in the standard
that clarify we should
not select the port in PORT_STATE_EXPIRED state?

The reason I ask is, as my patch describe, we should enter EXPIRED state when
we receive lacp without sync bit, which means we will not select the
port based on
commit a8e7d65f64b, then in function lacp_port_actor_update() we will never set
local state to SYNCHRONIZATION. So we will never send lacp message with sync
bit before partner send sync. Cisco has an option 'no lacp
graceful-convergence',
which did a similar thing. With this option and my patch, libteam and
Cisco switch will
never go to COLLECTING & DISTRIBUTING as both sides are waiting for
the sync bit.

So, should we select port with EXPIRED state?

Thanks
Hangbin
Hangbin Liu March 7, 2019, 2:44 a.m. UTC | #3
On Thu, 28 Feb 2019 at 21:06, Hangbin Liu <liuhangbin@gmail.com> wrote:
> > diff --git a/teamd/teamd_runner_lacp.c b/teamd/teamd_runner_lacp.c
> > index 7b8f0a7..5e763ad 100644
> > --- a/teamd/teamd_runner_lacp.c
> > +++ b/teamd/teamd_runner_lacp.c
> > @@ -333,7 +333,8 @@ static int lacp_port_should_be_enabled(struct lacp_port *lacp_port)
> >         struct lacp *lacp = lacp_port->lacp;
> >
> >         if (lacp_port_selected(lacp_port) &&
> > -           lacp_port->agg_lead == lacp->selected_agg_lead)
> > +           lacp_port->agg_lead == lacp->selected_agg_lead &&
> > +           lacp_port->partner.state & INFO_STATE_SYNCHRONIZATION)
>
> First is here. I want to enable the port when partner enters COLLECTING state.
>
> In this patch, We will set state to "SYNC & COLLECTING & DISTRIBUTIN"
> and enable port
> when partner enter SYNCHRONIZATION state. But as talked in [1],
>
> Source Peer
> S:0 C:0 D:0 S:0 C:0 D:0
> S:1 C:0 D:0 ->
> <- S:1 C:0 D:0 ?> At this time, peer LACP members have been chosen
> S:1 C:1 D:0 -> ?> Source port indicates that it is ready to accept traffic
> <- S:1 C:1 D:0 ?> Destination port indicates that it is ready to accept traffic
> S:1 C:1 D:1 -> ?> Source port indicates that it is ready to send traffic
> <- S:1 C:1 D:1 ?> Destination port indicates that it is ready to send traffic
>
> After we enabling port and start transmitting data, partner is in SYNC
> state but not in
> COLLECTING. There will be a moment that partner not ready to receive
> data.  The data
> will be dropped.
>
> So enable port after partner ready to receive trafice(in COLLECTING
> state) seems better.

OK, based on IEEE standard 8021AX-2014, Figure 6-22

"the coupled control state machine does not wait for the Partner to signal
 that collection has started before enabling both collection and distribution."

So there is no need to wait for partner entering COLLECTING state
before enable slave.

> > @@ -914,9 +916,13 @@ static void lacp_port_actor_update(struct lacp_port *lacp_port)
> >         if (lacp_port->lacp->cfg.fast_rate)
> >                 state |= INFO_STATE_LACP_TIMEOUT;
> >         if (lacp_port_selected(lacp_port) &&
> > -           lacp_port_agg_selected(lacp_port))
> > -               state |= INFO_STATE_SYNCHRONIZATION |
> > -                        INFO_STATE_COLLECTING | INFO_STATE_DISTRIBUTING;
> > +           lacp_port_agg_selected(lacp_port)) {
> > +               state |= INFO_STATE_SYNCHRONIZATION;
> > +               state &= ~(INFO_STATE_COLLECTING | INFO_STATE_DISTRIBUTING);
> > +               if (lacp_port->partner.state & INFO_STATE_SYNCHRONIZATION)
>
> Same here,  should we set state to COLLECTING & DISTRIBUTING after
> partner enters
> INFO_STATE_COLLECTING state?

no

> The second question is about commit a8e7d65f64b ("teamd: lacp: "expired" port
>  is not selectable, only "current""). Is there a line in the standard
> that clarify we should
> not select the port in PORT_STATE_EXPIRED state?
>
> The reason I ask is, as my patch describe, we should enter EXPIRED state when
> we receive lacp without sync bit, which means we will not select the
> port based on
> commit a8e7d65f64b, then in function lacp_port_actor_update() we will never set
> local state to SYNCHRONIZATION. So we will never send lacp message with sync
> bit before partner send sync. Cisco has an option 'no lacp
> graceful-convergence',
> which did a similar thing. With this option and my patch, libteam and
> Cisco switch will
> never go to COLLECTING & DISTRIBUTING as both sides are waiting for
> the sync bit.

I reconsider this sync and found the corrent patch is incorrect.

The current patch logic looks like

INIT -> send and receive "E:1 G:1 A:1" - > EXPIRED -> recive "S:1 G:1
A:1" - CURRENT.

But if both sides are in EXPIRED state, they will only send and
receive "E:1 G:1 A:1" messages.
Both sides will never receive "S:1" bit and never go in CURRENT state.

The correct logic should be like
INIT -> port link up && lacp enable -> EXPIRED -> send "E:1 G:1 A:1"
-> receive "E:1 G:1 A:1" -> CURRENT -> send "S:1 C:1 D:1"
Which is the current libteam logic. I will remove the first part of my patch

Thanks
Hangbin
diff mbox series

Patch

diff --git a/teamd/teamd_runner_lacp.c b/teamd/teamd_runner_lacp.c
index 7b8f0a7..5e763ad 100644
--- a/teamd/teamd_runner_lacp.c
+++ b/teamd/teamd_runner_lacp.c
@@ -333,7 +333,8 @@  static int lacp_port_should_be_enabled(struct lacp_port *lacp_port)
 	struct lacp *lacp = lacp_port->lacp;
 
 	if (lacp_port_selected(lacp_port) &&
-	    lacp_port->agg_lead == lacp->selected_agg_lead)
+	    lacp_port->agg_lead == lacp->selected_agg_lead &&
+	    lacp_port->partner.state & INFO_STATE_SYNCHRONIZATION)
 		return true;
 	return false;
 }
@@ -343,7 +344,8 @@  static int lacp_port_should_be_disabled(struct lacp_port *lacp_port)
 	struct lacp *lacp = lacp_port->lacp;
 
 	if (!lacp_port_selected(lacp_port) ||
-	    lacp_port->agg_lead != lacp->selected_agg_lead)
+	    lacp_port->agg_lead != lacp->selected_agg_lead ||
+	    !(lacp_port->partner.state & INFO_STATE_SYNCHRONIZATION))
 		return true;
 	return false;
 }
@@ -914,9 +916,13 @@  static void lacp_port_actor_update(struct lacp_port *lacp_port)
 	if (lacp_port->lacp->cfg.fast_rate)
 		state |= INFO_STATE_LACP_TIMEOUT;
 	if (lacp_port_selected(lacp_port) &&
-	    lacp_port_agg_selected(lacp_port))
-		state |= INFO_STATE_SYNCHRONIZATION |
-			 INFO_STATE_COLLECTING | INFO_STATE_DISTRIBUTING;
+	    lacp_port_agg_selected(lacp_port)) {
+		state |= INFO_STATE_SYNCHRONIZATION;
+		state &= ~(INFO_STATE_COLLECTING | INFO_STATE_DISTRIBUTING);
+		if (lacp_port->partner.state & INFO_STATE_SYNCHRONIZATION)
+			state |= INFO_STATE_COLLECTING |
+				 INFO_STATE_DISTRIBUTING;
+	}
 	if (lacp_port->state == PORT_STATE_EXPIRED)
 		state |= INFO_STATE_EXPIRED;
 	if (lacp_port->state == PORT_STATE_DEFAULTED)
@@ -1095,7 +1101,10 @@  static int lacpdu_recv(struct lacp_port *lacp_port)
 			return err;
 	}
 
-	err = lacp_port_set_state(lacp_port, PORT_STATE_CURRENT);
+	if (lacp_port->partner.state & INFO_STATE_SYNCHRONIZATION)
+		err = lacp_port_set_state(lacp_port, PORT_STATE_CURRENT);
+	else
+		err = lacp_port_set_state(lacp_port, PORT_STATE_EXPIRED);
 	if (err)
 		return err;