Message ID | 20240424133023.4150624-5-danieller@nvidia.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Series | Add ability to flash modules' firmware |
On Wed, 24 Apr 2024 16:30:17 +0300 Danielle Ratson wrote:
> +	hdr = ethnl_bcastmsg_put(skb, ETHTOOL_MSG_MODULE_FW_FLASH_NTF);
> +	if (!hdr)
> +		goto err_skb;

Do we want to blast it to all listeners or treat it as an async reply?
We can save the seq and portid of the original requester and use reply,
I think.

> +	ret = ethnl_fill_reply_header(skb, dev,
> +				      ETHTOOL_A_MODULE_FW_FLASH_HEADER);
> +	if (ret < 0)
> +		goto err_skb;
> +
> +	if (nla_put_u32(skb, ETHTOOL_A_MODULE_FW_FLASH_STATUS, status))
> +		goto err_skb;
> +
> +	if (status_msg &&
> +	    nla_put_string(skb, ETHTOOL_A_MODULE_FW_FLASH_STATUS_MSG,
> +			   status_msg))
> +		goto err_skb;
> +
> +	if (nla_put_u64_64bit(skb, ETHTOOL_A_MODULE_FW_FLASH_DONE, done,
> +			      ETHTOOL_A_MODULE_FW_FLASH_PAD))

nla_put_uint()

> +		goto err_skb;
> +
> +	if (nla_put_u64_64bit(skb, ETHTOOL_A_MODULE_FW_FLASH_TOTAL, total,
> +			      ETHTOOL_A_MODULE_FW_FLASH_PAD))

nla_put_uint()

> +		goto err_skb;
> +
> +	genlmsg_end(skb, hdr);
> +	ethnl_multicast(skb, dev);
> +	return;
> +
> +err_skb:
> +	nlmsg_free(skb);
> +}
> +
> +void ethnl_module_fw_flash_ntf_err(struct net_device *dev,
> +				   char *err_msg, char *sub_err_msg)
> +{
> +	char status_msg[120];
> +
> +	if (sub_err_msg)
> +		sprintf(status_msg, "%s, %s.", err_msg, sub_err_msg);
> +	else
> +		sprintf(status_msg, "%s.", err_msg);

Hm, printing in the dot, and assuming sizeof err_msg + sub_err < 116
is a bit surprising. But I guess you have a reason...

Maybe pass them separately to ethnl_module_fw_flash_ntf() then you can
nla_reserve() the right amount of space and sprintf() directly into
the skb?

> +	ethnl_module_fw_flash_ntf(dev, ETHTOOL_MODULE_FW_FLASH_STATUS_ERROR,
> +				  status_msg, 0, 0);
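As background to the nla_put_uint() suggestion above: netlink's variable-width "uint" attribute carries a value as 4 bytes when it fits in 32 bits and as 8 bytes otherwise, so no separate padding attribute is involved. The following is a minimal userspace sketch of that width selection only. It is not the kernel implementation, and put_uint_payload() is a hypothetical name used purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical userspace model of a variable-width "uint" payload.
 * Copies the value into dst as 4 bytes if it fits in 32 bits, else as
 * 8 bytes, and returns the number of payload bytes written.
 * dst must have room for at least 8 bytes.
 */
size_t put_uint_payload(uint8_t *dst, uint64_t value)
{
	uint32_t v32 = (uint32_t)value;

	assert(dst != NULL);

	if (value == v32) {
		/* Small value: the 32-bit form is enough. */
		memcpy(dst, &v32, sizeof(v32));
		return sizeof(v32);
	}

	/* Value needs the full 64 bits. */
	memcpy(dst, &value, sizeof(value));
	return sizeof(value);
}
```

With this model, a "done" or "total" byte count below 4 GiB would occupy a 4-byte payload, and only larger counts would grow to 8 bytes.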
> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Tuesday, 30 April 2024 6:12
> To: Danielle Ratson <danieller@nvidia.com>
> Cc: netdev@vger.kernel.org; davem@davemloft.net; edumazet@google.com;
> pabeni@redhat.com; corbet@lwn.net; linux@armlinux.org.uk;
> sdf@google.com; kory.maincent@bootlin.com;
> maxime.chevallier@bootlin.com; vladimir.oltean@nxp.com;
> przemyslaw.kitszel@intel.com; ahmed.zaki@intel.com;
> richardcochran@gmail.com; shayagr@amazon.com;
> paul.greenwalt@intel.com; jiri@resnulli.us; linux-doc@vger.kernel.org;
> linux-kernel@vger.kernel.org; mlxsw <mlxsw@nvidia.com>; Petr Machata
> <petrm@nvidia.com>; Ido Schimmel <idosch@nvidia.com>
> Subject: Re: [PATCH net-next v5 04/10] ethtool: Add flashing transceiver
> modules' firmware notifications ability
>
> On Wed, 24 Apr 2024 16:30:17 +0300 Danielle Ratson wrote:
> > +	hdr = ethnl_bcastmsg_put(skb, ETHTOOL_MSG_MODULE_FW_FLASH_NTF);
> > +	if (!hdr)
> > +		goto err_skb;
>
> Do we want to blast it to all listeners or treat it as an async reply?
> We can save the seq and portid of the original requester and use reply, I
> think.

I am sorry, I am not sure I understood what you meant here... it should
be an async reply, but not sure I understood your suggestion. Can you
explain please?

Thanks!
>
> > +	ret = ethnl_fill_reply_header(skb, dev,
> > +				      ETHTOOL_A_MODULE_FW_FLASH_HEADER);
> > +	if (ret < 0)
> > +		goto err_skb;
> > +
> > +	if (nla_put_u32(skb, ETHTOOL_A_MODULE_FW_FLASH_STATUS, status))
> > +		goto err_skb;
> > +
> > +	if (status_msg &&
> > +	    nla_put_string(skb, ETHTOOL_A_MODULE_FW_FLASH_STATUS_MSG,
> > +			   status_msg))
> > +		goto err_skb;
> > +
> > +	if (nla_put_u64_64bit(skb, ETHTOOL_A_MODULE_FW_FLASH_DONE, done,
> > +			      ETHTOOL_A_MODULE_FW_FLASH_PAD))
>
> nla_put_uint()
>
> > +		goto err_skb;
> > +
> > +	if (nla_put_u64_64bit(skb, ETHTOOL_A_MODULE_FW_FLASH_TOTAL, total,
> > +			      ETHTOOL_A_MODULE_FW_FLASH_PAD))
>
> nla_put_uint()
>
> > +		goto err_skb;
> > +
> > +	genlmsg_end(skb, hdr);
> > +	ethnl_multicast(skb, dev);
> > +	return;
> > +
> > +err_skb:
> > +	nlmsg_free(skb);
> > +}
> > +
> > +void ethnl_module_fw_flash_ntf_err(struct net_device *dev,
> > +				   char *err_msg, char *sub_err_msg)
> > +{
> > +	char status_msg[120];
> > +
> > +	if (sub_err_msg)
> > +		sprintf(status_msg, "%s, %s.", err_msg, sub_err_msg);
> > +	else
> > +		sprintf(status_msg, "%s.", err_msg);
>
> Hm, printing in the dot, and assuming sizeof err_msg + sub_err < 116 is a bit
> surprising. But I guess you have a reason...
>
> Maybe pass them separately to ethnl_module_fw_flash_ntf() then you can
> nla_reserve() the right amount of space and sprintf() directly into the skb?

I can get rid of the dot actually, would it be ok like that?

>
> > +	ethnl_module_fw_flash_ntf(dev, ETHTOOL_MODULE_FW_FLASH_STATUS_ERROR,
> > +				  status_msg, 0, 0);
On Tue, 30 Apr 2024 18:11:18 +0000 Danielle Ratson wrote:
> > Do we want to blast it to all listeners or treat it as an async reply?
> > We can save the seq and portid of the original requester and use reply, I
> > think.
>
> I am sorry, I am not sure I understood what you meant here... it
> should be an async reply, but not sure I understood your suggestion.
>
> Can you explain please?

Make sure you have read the netlink intro, it should help fill in some
gaps I won't explicitly cover:

  https://docs.kernel.org/next/userspace-api/netlink/intro.html

"True" notifications will have pid = 0 and seq = 0, while replies to
commands have those fields populated based on the request.

pid identifies the socket where the message should be delivered.
ethnl_multicast() assumes that it's zero (since it's designed to work
for notifications) and sends the message to all sockets subscribed to
a multicast / notification group (ETHNL_MCGRP_MONITOR).

So that's the background. What you're doing isn't incorrect but I think
it'd be better if we didn't use the multicast group here, and sent the
messages as a reply - just to the socket which requested the flashing.
Still asynchronously, we just need to save the right pid and seq to use.

Two reasons for this:
 1) convenience, the user space socket won't have to subscribe to
    the multicast group
 2) the multicast group is really intended for notifying about
    _configuration changes_ done to the system. If there is a daemon
    listening on that group, there's a very high chance it won't care
    about progress of the flashing. Maybe we can send a single
    notification that flashing has been completed but not "progress
    updates"

I think it should work.
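The addressing rule described above can be sketched as follows. This is a hypothetical userspace model, not kernel code: struct flash_requester, struct msg_addr and both functions are names invented for illustration. In the kernel, the requester's port ID and sequence number would presumably be taken from the request's struct genl_info (snd_portid and snd_seq). That is an assumption about how the patch could be reworked, not something the thread spells out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Identity of the socket that asked for the flash (hypothetical type). */
struct flash_requester {
	uint32_t portid;	/* netlink port ID of the requesting socket */
	uint32_t seq;		/* sequence number of the original request */
};

/* How an outgoing message would be addressed (hypothetical type). */
struct msg_addr {
	uint32_t pid;
	uint32_t seq;
	bool multicast;
};

/* Remember who asked, at the time the flash request is accepted. */
void flash_requester_save(struct flash_requester *req,
			  uint32_t portid, uint32_t seq)
{
	assert(req != NULL);
	req->portid = portid;
	req->seq = seq;
}

/*
 * req == NULL models a "true" notification: pid = 0, seq = 0, sent to
 * the multicast group. Otherwise the message is an asynchronous reply
 * addressed only to the saved requester.
 */
struct msg_addr flash_msg_addr(const struct flash_requester *req)
{
	struct msg_addr addr = { .pid = 0, .seq = 0, .multicast = true };

	if (req) {
		addr.pid = req->portid;
		addr.seq = req->seq;
		addr.multicast = false;
	}
	return addr;
}
```

The model leaves out the harder parts of the unicast approach, such as noticing that the requesting socket has been closed or that its port ID has been reused.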
> > > +void ethnl_module_fw_flash_ntf_err(struct net_device *dev,
> > > +				   char *err_msg, char *sub_err_msg)
> > > +{
> > > +	char status_msg[120];
> > > +
> > > +	if (sub_err_msg)
> > > +		sprintf(status_msg, "%s, %s.", err_msg, sub_err_msg);
> > > +	else
> > > +		sprintf(status_msg, "%s.", err_msg);
> >
> > Hm, printing in the dot, and assuming sizeof err_msg + sub_err < 116 is a bit
> > surprising. But I guess you have a reason...
> >
> > Maybe pass them separately to ethnl_module_fw_flash_ntf() then you can
> > nla_reserve() the right amount of space and sprintf() directly into the skb?
>
> I can get rid of the dot actually, would it be ok like that?

It'd still be better to splice the two strings and the comma directly
to the skb, rather than on the stack using a function which doesn't
check the bounds of the buffer :S
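The point about bounds can be illustrated with a small userspace sketch: first compute exactly how many bytes the joined message needs, then format into a region of that size with a bounded call. This is only an analogy for the nla_reserve() plus sprintf() idea, with a plain buffer standing in for the space reserved in the skb. flash_status_len() and flash_status_fmt() are hypothetical names, and the trailing dot is dropped as discussed in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Bytes needed for "err" or "err, sub", including the trailing NUL. */
size_t flash_status_len(const char *err_msg, const char *sub_err_msg)
{
	size_t len;

	assert(err_msg != NULL);

	len = strlen(err_msg) + 1;
	if (sub_err_msg)
		len += strlen(", ") + strlen(sub_err_msg);
	return len;
}

/*
 * Format the message into dst, which holds len bytes. Returns 0 on
 * success and -1 if the message would not fit, instead of writing
 * past the end of the buffer as an unbounded sprintf() could.
 */
int flash_status_fmt(char *dst, size_t len,
		     const char *err_msg, const char *sub_err_msg)
{
	int n;

	assert(dst != NULL && err_msg != NULL);

	if (sub_err_msg)
		n = snprintf(dst, len, "%s, %s", err_msg, sub_err_msg);
	else
		n = snprintf(dst, len, "%s", err_msg);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would size the destination with flash_status_len() and then call flash_status_fmt() on it, so the 120-byte stack buffer and the implicit length assumption both go away.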
On Tue, Apr 30, 2024 at 01:03:02PM -0700, Jakub Kicinski wrote:
> On Tue, 30 Apr 2024 18:11:18 +0000 Danielle Ratson wrote:
> > > Do we want to blast it to all listeners or treat it as an async reply?
> > > We can save the seq and portid of the original requester and use reply, I
> > > think.
> >
> > I am sorry, I am not sure I understood what you meant here... it
> > should be an async reply, but not sure I understood your suggestion.
> >
> > Can you explain please?
>
> Make sure you have read the netlink intro, it should help fill in some
> gaps I won't explicitly cover:
> https://docs.kernel.org/next/userspace-api/netlink/intro.html
>
> "True" notifications will have pid = 0 and seq = 0, while replies to
> commands have those fields populated based on the request.
>
> pid identifies the socket where the message should be delivered.
> ethnl_multicast() assumes that it's zero (since it's designed to work
> for notifications) and sends the message to all sockets subscribed to
> a multicast / notification group (ETHNL_MCGRP_MONITOR).
>
> So that's the background. What you're doing isn't incorrect but I think
> it'd be better if we didn't use the multicast group here, and sent the
> messages as a reply - just to the socket which requested the flashing.
> Still asynchronously, we just need to save the right pid and seq to use.
>
> Two reasons for this:
>  1) convenience, the user space socket won't have to subscribe to
>     the multicast group
>  2) the multicast group is really intended for notifying about
>     _configuration changes_ done to the system. If there is a daemon
>     listening on that group, there's a very high chance it won't care
>     about progress of the flashing. Maybe we can send a single
>     notification that flashing has been completed but not "progress
>     updates"
>
> I think it should work.
We can try to use unicast, but the current design is influenced by devlink firmware flash (see __devlink_flash_update_notify()) and ethtool cable testing (see ethnl_cable_test_started() and ethnl_cable_test_finished()), both of which use multicast notifications although the latter does not update about progress. Do you want us to try the unicast approach or be consistent with the above examples?
On Wed, 1 May 2024 10:53:48 +0300 Ido Schimmel wrote:
> We can try to use unicast, but the current design is influenced by
> devlink firmware flash (see __devlink_flash_update_notify()) and ethtool
> cable testing (see ethnl_cable_test_started() and
> ethnl_cable_test_finished()), both of which use multicast notifications
> although the latter does not update about progress.
>
> Do you want us to try the unicast approach or be consistent with the
> above examples?

We are charting a bit of a new territory here, you're right that
the precedents point in the direction of multicast. The unicast is
harder to get done on the kernel side (we should probably also check
that the socket pid didn't get reused, stop sending the notifications
when original socket gets closed?) It will require using pretty much
all the pieces of advanced netlink infra we have, I'm happy to explain
more, but I'll also understand if you prefer to stick to multicast.
diff --git a/net/ethtool/module.c b/net/ethtool/module.c
index ceb575efc290..114a2ec986fe 100644
--- a/net/ethtool/module.c
+++ b/net/ethtool/module.c
@@ -5,6 +5,7 @@
 #include "netlink.h"
 #include "common.h"
 #include "bitset.h"
+#include "module_fw.h"
 
 struct module_req_info {
 	struct ethnl_req_info base;
@@ -158,3 +159,85 @@ const struct ethnl_request_ops ethnl_module_request_ops = {
 	.set			= ethnl_set_module,
 	.set_ntf_cmd		= ETHTOOL_MSG_MODULE_NTF,
 };
+
+/* MODULE_FW_FLASH_NTF */
+
+static void
+ethnl_module_fw_flash_ntf(struct net_device *dev,
+			  enum ethtool_module_fw_flash_status status,
+			  const char *status_msg, u64 done, u64 total)
+{
+	struct sk_buff *skb;
+	void *hdr;
+	int ret;
+
+	skb = genlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
+	if (!skb)
+		return;
+
+	hdr = ethnl_bcastmsg_put(skb, ETHTOOL_MSG_MODULE_FW_FLASH_NTF);
+	if (!hdr)
+		goto err_skb;
+
+	ret = ethnl_fill_reply_header(skb, dev,
+				      ETHTOOL_A_MODULE_FW_FLASH_HEADER);
+	if (ret < 0)
+		goto err_skb;
+
+	if (nla_put_u32(skb, ETHTOOL_A_MODULE_FW_FLASH_STATUS, status))
+		goto err_skb;
+
+	if (status_msg &&
+	    nla_put_string(skb, ETHTOOL_A_MODULE_FW_FLASH_STATUS_MSG,
+			   status_msg))
+		goto err_skb;
+
+	if (nla_put_u64_64bit(skb, ETHTOOL_A_MODULE_FW_FLASH_DONE, done,
+			      ETHTOOL_A_MODULE_FW_FLASH_PAD))
+		goto err_skb;
+
+	if (nla_put_u64_64bit(skb, ETHTOOL_A_MODULE_FW_FLASH_TOTAL, total,
+			      ETHTOOL_A_MODULE_FW_FLASH_PAD))
+		goto err_skb;
+
+	genlmsg_end(skb, hdr);
+	ethnl_multicast(skb, dev);
+	return;
+
+err_skb:
+	nlmsg_free(skb);
+}
+
+void ethnl_module_fw_flash_ntf_err(struct net_device *dev,
+				   char *err_msg, char *sub_err_msg)
+{
+	char status_msg[120];
+
+	if (sub_err_msg)
+		sprintf(status_msg, "%s, %s.", err_msg, sub_err_msg);
+	else
+		sprintf(status_msg, "%s.", err_msg);
+
+	ethnl_module_fw_flash_ntf(dev, ETHTOOL_MODULE_FW_FLASH_STATUS_ERROR,
+				  status_msg, 0, 0);
+}
+
+void ethnl_module_fw_flash_ntf_start(struct net_device *dev)
+{
+	ethnl_module_fw_flash_ntf(dev, ETHTOOL_MODULE_FW_FLASH_STATUS_STARTED,
+				  NULL, 0, 0);
+}
+
+void ethnl_module_fw_flash_ntf_complete(struct net_device *dev)
+{
+	ethnl_module_fw_flash_ntf(dev, ETHTOOL_MODULE_FW_FLASH_STATUS_COMPLETED,
+				  NULL, 0, 0);
+}
+
+void ethnl_module_fw_flash_ntf_in_progress(struct net_device *dev, u64 done,
+					   u64 total)
+{
+	ethnl_module_fw_flash_ntf(dev,
+				  ETHTOOL_MODULE_FW_FLASH_STATUS_IN_PROGRESS,
+				  NULL, done, total);
+}
diff --git a/net/ethtool/module_fw.h b/net/ethtool/module_fw.h
new file mode 100644
index 000000000000..e40eae442741
--- /dev/null
+++ b/net/ethtool/module_fw.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <uapi/linux/ethtool.h>
+
+void ethnl_module_fw_flash_ntf_err(struct net_device *dev,
+				   char *err_msg, char *sub_err_msg);
+void ethnl_module_fw_flash_ntf_start(struct net_device *dev);
+void ethnl_module_fw_flash_ntf_complete(struct net_device *dev);
+void ethnl_module_fw_flash_ntf_in_progress(struct net_device *dev, u64 done,
+					   u64 total);
Add progress notifications ability to user space while flashing modules'
firmware by implementing the interface between the user space and the
kernel.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
---

Notes:
    v2:
    	* Increase err_msg length.

 net/ethtool/module.c    | 83 +++++++++++++++++++++++++++++++++++++++++
 net/ethtool/module_fw.h | 10 +++++
 2 files changed, 93 insertions(+)
 create mode 100644 net/ethtool/module_fw.h