diff mbox series

[iproute2] dcb: unblock mnl_socket_recvfrom if not message received

Message ID 20221019012008.11322-1-lanhao@huawei.com (mailing list archive)
State Accepted
Delegated to: Stephen Hemminger
Headers show
Series [iproute2] dcb: unblock mnl_socket_recvfrom if not message received | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Hao Lan Oct. 19, 2022, 1:20 a.m. UTC
From: Junxin Chen <chenjunxin1@huawei.com>

Currently, the dcb command sinks to the kernel through the netlink
to obtain information. However, if the kernel fails to obtain infor-
mation or is not processed, the dcb command is suspended.

For example, if we don't implement dcbnl_ops->ieee_getpfc in the
kernel, the command "dcb pfc show dev eth1" will be stuck and subsequent
commands cannot be executed.

This patch adds the NLM_F_ACK flag to the netlink in mnlu_msg_prepare
to ensure that the kernel responds to user requests.

After the problem is solved, the execution result is as follows:
$ dcb pfc show dev eth1
Attribute not found: Success

Fixes: 67033d1c1c ("Add skeleton of a new tool, dcb")
Signed-off-by: Junxin Chen <chenjunxin1@huawei.com>
---
 dcb/dcb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

David Ahern Oct. 20, 2022, 1:23 p.m. UTC | #1
[ cc Petr ]

always add authors of patches you are fixing.

On 10/18/22 7:20 PM, Hao Lan wrote:
> From: Junxin Chen <chenjunxin1@huawei.com>
> 
> Currently, the dcb command sinks to the kernel through the netlink
> to obtain information. However, if the kernel fails to obtain infor-
> mation or is not processed, the dcb command is suspended.
> 
> For example, if we don't implement dcbnl_ops->ieee_getpfc in the
> kernel, the command "dcb pfc show dev eth1" will be stuck and subsequent
> commands cannot be executed.
> 
> This patch adds the NLM_F_ACK flag to the netlink in mnlu_msg_prepare
> to ensure that the kernel responds to user requests.
> 
> After the problem is solved, the execution result is as follows:
> $ dcb pfc show dev eth1
> Attribute not found: Success
> 
> Fixes: 67033d1c1c ("Add skeleton of a new tool, dcb")
> Signed-off-by: Junxin Chen <chenjunxin1@huawei.com>
> ---
>  dcb/dcb.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/dcb/dcb.c b/dcb/dcb.c
> index 8d75ab0a..a6f457fb 100644
> --- a/dcb/dcb.c
> +++ b/dcb/dcb.c
> @@ -156,7 +156,7 @@ static struct nlmsghdr *dcb_prepare(struct dcb *dcb, const char *dev,
>  	};
>  	struct nlmsghdr *nlh;
>  
> -	nlh = mnlu_msg_prepare(dcb->buf, nlmsg_type, NLM_F_REQUEST, &dcbm, sizeof(dcbm));
> +	nlh = mnlu_msg_prepare(dcb->buf, nlmsg_type, NLM_F_REQUEST | NLM_F_ACK, &dcbm, sizeof(dcbm));
>  	mnl_attr_put_strz(nlh, DCB_ATTR_IFNAME, dev);
>  	return nlh;
>  }
Stephen Hemminger Oct. 20, 2022, 10:59 p.m. UTC | #2
On Wed, 19 Oct 2022 09:20:08 +0800
Hao Lan <lanhao@huawei.com> wrote:

> From: Junxin Chen <chenjunxin1@huawei.com>
> 
> Currently, the dcb command sinks to the kernel through the netlink
> to obtain information. However, if the kernel fails to obtain infor-
> mation or is not processed, the dcb command is suspended.
> 
> For example, if we don't implement dcbnl_ops->ieee_getpfc in the
> kernel, the command "dcb pfc show dev eth1" will be stuck and subsequent
> commands cannot be executed.
> 
> This patch adds the NLM_F_ACK flag to the netlink in mnlu_msg_prepare
> to ensure that the kernel responds to user requests.
> 
> After the problem is solved, the execution result is as follows:
> $ dcb pfc show dev eth1
> Attribute not found: Success
> 
> Fixes: 67033d1c1c ("Add skeleton of a new tool, dcb")
> Signed-off-by: Junxin Chen <chenjunxin1@huawei.com>

Applied, fixed these two checkpatch warnings in original submission.

WARNING: Please use correct Fixes: style 'Fixes: <12 chars of sha1> ("<title line>")' - ie: 'Fixes: 67033d1c1c8a ("Add skeleton of a new tool, dcb")'
#74: 
Fixes: 67033d1c1c ("Add skeleton of a new tool, dcb")

WARNING: line length of 101 exceeds 100 columns
#89: FILE: dcb/dcb.c:159:
+	nlh = mnlu_msg_prepare(dcb->buf, nlmsg_type, NLM_F_REQUEST | NLM_F_ACK, &dcbm, sizeof(dcbm));

total: 0 errors, 2 warnings, 8 lines checked
Ido Schimmel Dec. 20, 2022, 1:34 p.m. UTC | #3
On Wed, Oct 19, 2022 at 09:20:08AM +0800, Hao Lan wrote:
> From: Junxin Chen <chenjunxin1@huawei.com>
> 
> Currently, the dcb command sinks to the kernel through the netlink
> to obtain information. However, if the kernel fails to obtain infor-
> mation or is not processed, the dcb command is suspended.
> 
> For example, if we don't implement dcbnl_ops->ieee_getpfc in the
> kernel, the command "dcb pfc show dev eth1" will be stuck and subsequent
> commands cannot be executed.
> 
> This patch adds the NLM_F_ACK flag to the netlink in mnlu_msg_prepare
> to ensure that the kernel responds to user requests.

The analysis is not correct: The kernel does reply, but the reply does not
contain the 'DCB_ATTR_IEEE_PFC' attribute, causing the dcb utility to block on
recvmsg(). Since you changed the utility to request an ACK you need to make
sure this ACK is processed before issuing another request. Please test the
following patch. I would like to post it tomorrow.

Thanks

commit 7b545308a2273a7fd26204688fa632ec1b4c0205
Author: Ido Schimmel <idosch@nvidia.com>
Date:   Tue Dec 20 14:27:46 2022 +0200

    dcb: Do not leave ACKs in socket receive buffer
    
    Originally, the dcb utility only stopped receiving messages from a
    socket when it found the attribute it was looking for. Cited commit
    changed that, so that the utility will also stop when seeing an ACK
    (NLMSG_ERROR message), by setting the NLM_F_ACK flag on requests.
    
    This is problematic because it means a successful request will leave an
    ACK in the socket receive buffer, causing the next request to bail
    before reading its response.
    
    Fix that by not stopping when finding the required attribute in a
    response. Instead, stop on the subsequent ACK.
    
    Fixes: 84c036972659 ("dcb: unblock mnl_socket_recvfrom if not message received")
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>

diff --git a/dcb/dcb.c b/dcb/dcb.c
index 3ffa91d64d0d..9b996abac529 100644
--- a/dcb/dcb.c
+++ b/dcb/dcb.c
@@ -72,7 +72,7 @@ static int dcb_get_attribute_attr_ieee_cb(const struct nlattr *attr, void *data)
 
 	ga->payload = mnl_attr_get_payload(attr);
 	ga->payload_len = mnl_attr_get_payload_len(attr);
-	return MNL_CB_STOP;
+	return MNL_CB_OK;
 }
 
 static int dcb_get_attribute_attr_cb(const struct nlattr *attr, void *data)
@@ -126,7 +126,7 @@ static int dcb_set_attribute_attr_cb(const struct nlattr *attr, void *data)
 		return MNL_CB_ERROR;
 	}
 
-	return MNL_CB_STOP;
+	return MNL_CB_OK;
 }
 
 static int dcb_set_attribute_cb(const struct nlmsghdr *nlh, void *data)
diff mbox series

Patch

diff --git a/dcb/dcb.c b/dcb/dcb.c
index 8d75ab0a..a6f457fb 100644
--- a/dcb/dcb.c
+++ b/dcb/dcb.c
@@ -156,7 +156,7 @@  static struct nlmsghdr *dcb_prepare(struct dcb *dcb, const char *dev,
 	};
 	struct nlmsghdr *nlh;
 
-	nlh = mnlu_msg_prepare(dcb->buf, nlmsg_type, NLM_F_REQUEST, &dcbm, sizeof(dcbm));
+	nlh = mnlu_msg_prepare(dcb->buf, nlmsg_type, NLM_F_REQUEST | NLM_F_ACK, &dcbm, sizeof(dcbm));
 	mnl_attr_put_strz(nlh, DCB_ATTR_IFNAME, dev);
 	return nlh;
 }