Message ID | 20221019012008.11322-1-lanhao@huawei.com (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Stephen Hemminger |
Series | [iproute2] dcb: unblock mnl_socket_recvfrom if not message received |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
[ cc Petr ] always add authors of patches you are fixing.

On 10/18/22 7:20 PM, Hao Lan wrote:
> From: Junxin Chen <chenjunxin1@huawei.com>
>
> Currently, the dcb command sinks to the kernel through the netlink
> to obtain information. However, if the kernel fails to obtain infor-
> mation or is not processed, the dcb command is suspended.
>
> For example, if we don't implement dcbnl_ops->ieee_getpfc in the
> kernel, the command "dcb pfc show dev eth1" will be stuck and subsequent
> commands cannot be executed.
>
> This patch adds the NLM_F_ACK flag to the netlink in mnlu_msg_prepare
> to ensure that the kernel responds to user requests.
>
> After the problem is solved, the execution result is as follows:
> $ dcb pfc show dev eth1
> Attribute not found: Success
>
> Fixes: 67033d1c1c ("Add skeleton of a new tool, dcb")
> Signed-off-by: Junxin Chen <chenjunxin1@huawei.com>
> ---
>  dcb/dcb.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/dcb/dcb.c b/dcb/dcb.c
> index 8d75ab0a..a6f457fb 100644
> --- a/dcb/dcb.c
> +++ b/dcb/dcb.c
> @@ -156,7 +156,7 @@ static struct nlmsghdr *dcb_prepare(struct dcb *dcb, const char *dev,
>  	};
>  	struct nlmsghdr *nlh;
>
> -	nlh = mnlu_msg_prepare(dcb->buf, nlmsg_type, NLM_F_REQUEST, &dcbm, sizeof(dcbm));
> +	nlh = mnlu_msg_prepare(dcb->buf, nlmsg_type, NLM_F_REQUEST | NLM_F_ACK, &dcbm, sizeof(dcbm));
>  	mnl_attr_put_strz(nlh, DCB_ATTR_IFNAME, dev);
>  	return nlh;
>  }

On Wed, 19 Oct 2022 09:20:08 +0800
Hao Lan <lanhao@huawei.com> wrote:

> From: Junxin Chen <chenjunxin1@huawei.com>
>
> Currently, the dcb command sinks to the kernel through the netlink
> to obtain information. However, if the kernel fails to obtain infor-
> mation or is not processed, the dcb command is suspended.
>
> For example, if we don't implement dcbnl_ops->ieee_getpfc in the
> kernel, the command "dcb pfc show dev eth1" will be stuck and subsequent
> commands cannot be executed.
>
> This patch adds the NLM_F_ACK flag to the netlink in mnlu_msg_prepare
> to ensure that the kernel responds to user requests.
>
> After the problem is solved, the execution result is as follows:
> $ dcb pfc show dev eth1
> Attribute not found: Success
>
> Fixes: 67033d1c1c ("Add skeleton of a new tool, dcb")
> Signed-off-by: Junxin Chen <chenjunxin1@huawei.com>

Applied, fixed these two checkpatch warnings in original submission.

WARNING: Please use correct Fixes: style 'Fixes: <12 chars of sha1> ("<title line>")' - ie: 'Fixes: 67033d1c1c8a ("Add skeleton of a new tool, dcb")'
#74:
Fixes: 67033d1c1c ("Add skeleton of a new tool, dcb")

WARNING: line length of 101 exceeds 100 columns
#89: FILE: dcb/dcb.c:159:
+	nlh = mnlu_msg_prepare(dcb->buf, nlmsg_type, NLM_F_REQUEST | NLM_F_ACK, &dcbm, sizeof(dcbm));

total: 0 errors, 2 warnings, 8 lines checked

On Wed, Oct 19, 2022 at 09:20:08AM +0800, Hao Lan wrote:
> From: Junxin Chen <chenjunxin1@huawei.com>
>
> Currently, the dcb command sinks to the kernel through the netlink
> to obtain information. However, if the kernel fails to obtain infor-
> mation or is not processed, the dcb command is suspended.
>
> For example, if we don't implement dcbnl_ops->ieee_getpfc in the
> kernel, the command "dcb pfc show dev eth1" will be stuck and subsequent
> commands cannot be executed.
>
> This patch adds the NLM_F_ACK flag to the netlink in mnlu_msg_prepare
> to ensure that the kernel responds to user requests.

The analysis is not correct: The kernel does reply, but the reply does
not contain the 'DCB_ATTR_IEEE_PFC' attribute, causing the dcb utility
to block on recvmsg().

Since you changed the utility to request an ACK you need to make sure
this ACK is processed before issuing another request. Please test the
following patch. I would like to post it tomorrow.

Thanks

commit 7b545308a2273a7fd26204688fa632ec1b4c0205
Author: Ido Schimmel <idosch@nvidia.com>
Date:   Tue Dec 20 14:27:46 2022 +0200

    dcb: Do not leave ACKs in socket receive buffer

    Originally, the dcb utility only stopped receiving messages from a
    socket when it found the attribute it was looking for. Cited commit
    changed that, so that the utility will also stop when seeing an ACK
    (NLMSG_ERROR message), by setting the NLM_F_ACK flag on requests.

    This is problematic because it means a successful request will leave
    an ACK in the socket receive buffer, causing the next request to bail
    before reading its response.

    Fix that by not stopping when finding the required attribute in a
    response. Instead, stop on the subsequent ACK.

    Fixes: 84c036972659 ("dcb: unblock mnl_socket_recvfrom if not message received")
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>

diff --git a/dcb/dcb.c b/dcb/dcb.c
index 3ffa91d64d0d..9b996abac529 100644
--- a/dcb/dcb.c
+++ b/dcb/dcb.c
@@ -72,7 +72,7 @@ static int dcb_get_attribute_attr_ieee_cb(const struct nlattr *attr, void *data)
 	ga->payload = mnl_attr_get_payload(attr);
 	ga->payload_len = mnl_attr_get_payload_len(attr);
 
-	return MNL_CB_STOP;
+	return MNL_CB_OK;
 }
 
 static int dcb_get_attribute_attr_cb(const struct nlattr *attr, void *data)
@@ -126,7 +126,7 @@ static int dcb_set_attribute_attr_cb(const struct nlattr *attr, void *data)
 		return MNL_CB_ERROR;
 	}
 
-	return MNL_CB_STOP;
+	return MNL_CB_OK;
 }
 
 static int dcb_set_attribute_cb(const struct nlmsghdr *nlh, void *data)

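The stale-ACK problem described in the commit message above can be illustrated with a small model. This is only a sketch and does not use libmnl or a real netlink socket: the names `rx_queue`, `kernel_reply` and `request` are hypothetical, the array stands in for the socket receive buffer, and the `stop_on_attr` flag stands in for the attribute callback returning MNL_CB_STOP (true) versus MNL_CB_OK (false). It assumes every request is answered with one attribute message followed by one ACK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical message kinds standing in for netlink message types. */
enum msg_kind { MSG_ATTR, MSG_ACK };

#define RX_CAP 16

/* A tiny FIFO standing in for the socket receive buffer (assumption). */
struct rx_queue {
	enum msg_kind msgs[RX_CAP];
	size_t head, tail;
};

static void rx_push(struct rx_queue *q, enum msg_kind m)
{
	assert(q->tail < RX_CAP); /* the model only supports a few requests */
	q->msgs[q->tail++] = m;
}

/* Kernel side of the model: reply with the attribute, then the ACK. */
static void kernel_reply(struct rx_queue *q)
{
	rx_push(q, MSG_ATTR);
	rx_push(q, MSG_ACK);
}

/*
 * User side of the model: send one request and read until told to stop.
 * stop_on_attr == true models MNL_CB_STOP in the attribute callback,
 * stop_on_attr == false models MNL_CB_OK (keep reading until the ACK).
 * Returns true if the attribute was seen before the loop ended.
 */
static bool request(struct rx_queue *q, bool stop_on_attr)
{
	bool found = false;

	kernel_reply(q);
	while (q->head < q->tail) {
		enum msg_kind m = q->msgs[q->head++];

		if (m == MSG_ACK)
			break; /* an ACK always ends the read loop */
		found = true;
		if (stop_on_attr)
			break; /* leaves this request's ACK unread */
	}
	return found;
}
```

In this model, two consecutive calls to `request()` on the same queue with `stop_on_attr` set to true return true and then false, because the second call consumes the ACK left behind by the first. With `stop_on_attr` set to false, every call drains its own ACK and returns true.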
diff --git a/dcb/dcb.c b/dcb/dcb.c
index 8d75ab0a..a6f457fb 100644
--- a/dcb/dcb.c
+++ b/dcb/dcb.c
@@ -156,7 +156,7 @@ static struct nlmsghdr *dcb_prepare(struct dcb *dcb, const char *dev,
 	};
 	struct nlmsghdr *nlh;
 
-	nlh = mnlu_msg_prepare(dcb->buf, nlmsg_type, NLM_F_REQUEST, &dcbm, sizeof(dcbm));
+	nlh = mnlu_msg_prepare(dcb->buf, nlmsg_type, NLM_F_REQUEST | NLM_F_ACK, &dcbm, sizeof(dcbm));
 	mnl_attr_put_strz(nlh, DCB_ATTR_IFNAME, dev);
 	return nlh;
 }