diff mbox

[PATCHv2] libteam: do not destroy the ifinfo of current unregistered slave dev

Message ID CY1PR14MB02034018ECD07C1731380FCDB58E0@CY1PR14MB0203.namprd14.prod.outlook.com (mailing list archive)
State New
Headers show

Commit Message

Tiainen, Antti April 26, 2018, 7:43 p.m. UTC
The reason why it didn't crash was because in the first environment I tried was
because it got port removal event before dellink (they come to different sockets).
In other environment, it did crash.

Xin Long's patch does the job, but the port hangs there in ifinfo list and debug
output quite a long time. I'll send alternative patch for suggestion how to fix
this issue. It would not crash and it removes the deleted port device from ifinfo
list, but it will still need a NULL check for debug print for the extremely rare
situation that dellink event is lost due to full socket receive buffer.


From: Tiainen, Antti

Sent: Monday, April 23, 2018 3:01 PM
To: Xin Long; libteam@lists.fedorahosted.org; Jiri Pirko
Subject: Re: EXTERNAL: [PATCHv2] libteam: do not destroy the ifinfo of current unregistered slave dev

Strange, it doesn't crash for me if I try your example. But if I try with your patch, the deleted interface just hangs there in ifinfo list forever, even when change handlers are called.

If this is only a debug print issue, it cannot do simply NULL check in there, and print something (or nothing) for the interface that's completely gone? 

From: Xin Long <lucien.xin@gmail.com>

Sent: Sunday, April 22, 2018 2:18:11 PM
To: libteam@lists.fedorahosted.org; Jiri Pirko
Cc: Tiainen, Antti
Subject: EXTERNAL: [PATCHv2] libteam: do not destroy the ifinfo of current unregistered slave dev

When a dev is being unregistered and the corresp team port is being
removed, the ifinfo should be delayed to destroy when doing dellink
next time, because later on the dbg log still needs it to print the
ifname that is saved in this ifinfo:
  -1450: veth1: down 0Mbit HD <--

The similar process is also used on the team_port's destruction.

However, after Commit 046fb6ba0aec ("libteam: resynchronize ifinfo
after lost RTNLGRP_LINK notifications"), it would destroy all those
ifinfo with REMOVED flag, including the one of current unregistered
slave dev. It would cause a crash later when printing the dbg log.

An easy way to reproduce:

  # ip link add veth1 type veth peer name veth1_peer
  # sleep 5 && ip link del veth1 &
  # teamd -g -c '{"device":"team0","runner":{"name":"activebackup"},

This patch is to check !ifinfo->port before calling ifinfo_destroy in
ifinfo_destroy_removed and clear_changed in ifinfo_destroy_removed.
After this fix, the ifinfo of current unregistered slave dev will not
be destroyed this time until next time when processing IFINFO_CHANGE
flag for another ifinfo's change in check_call_change_handlers.

  - improve the changelog, including some typos Jiri noticed.

Fixes: 046fb6ba0aec ("libteam: resynchronize ifinfo after lost RTNLGRP_LINK notifications")
Signed-off-by: Xin Long <lucien.xin@gmail.com>

 libteam/ifinfo.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff mbox


diff --git a/libteam/ifinfo.c b/libteam/ifinfo.c
index 5c32a9c..d47c2bf 100644
--- a/libteam/ifinfo.c
+++ b/libteam/ifinfo.c
@@ -211,7 +211,8 @@  void ifinfo_clear_changed(struct team_handle *th)
         struct team_ifinfo *ifinfo;
         list_for_each_node_entry(ifinfo, &th->ifinfo_list, list)
-               clear_changed(ifinfo);
+               if (!ifinfo->port)
+                       clear_changed(ifinfo);
 static struct team_ifinfo *ifinfo_find_create(struct team_handle *th,
@@ -245,7 +246,7 @@  void ifinfo_destroy_removed(struct team_handle *th)
         struct team_ifinfo *ifinfo, *tmp;
         list_for_each_node_entry_safe(ifinfo, tmp, &th->ifinfo_list, list) {
-               if (is_changed(ifinfo, CHANGED_REMOVED))
+               if (is_changed(ifinfo, CHANGED_REMOVED) && !ifinfo->port)