Message ID | 20211127174530.3600237-1-idosch@idosch.org (mailing list archive) |
---|---|
Headers | show |
Series | ethtool: Add ability to flash and query transceiver modules' firmware | expand |
On Sat, 27 Nov 2021 19:45:26 +0200 Ido Schimmel wrote: > This patchset extends the ethtool netlink API to allow user space to > both flash transceiver modules' firmware and query the firmware > information (e.g., version, state). > > The main use case is CMIS compliant modules such as QSFP-DD. The CMIS > standard specifies the interfaces used for both operations. See section > 7.3.1 in revision 5.0 of the standard [1]. > > Despite the immediate use case being CMIS compliant modules, the user > interface is kept generic enough to accommodate future use cases, if > these arise. > > The purpose of this RFC is to solicit feedback on both the proposed user > interface and the device driver API which are described in detail in > patches #1 and #3. The netdevsim patches are for RFC purposes only. The > plan is to implement the CMIS functionality in common code (under lib/) > so that it can be shared by MAC drivers that will pass function pointers > to it in order to read and write from their modules EEPROM. > > ethtool(8) patches can be found here [2]. Immediate question I have is why not devlink. We purposefully moved FW flashing to devlink because I may take long, so doing it under rtnl_lock is really bad. Other advantages exist (like flashing non-Ethernet ports). Ethtool netlink already existed at the time. I think device flashing may also benefit from the infra you're adding.
On Mon, Nov 29, 2021 at 09:37:24AM -0800, Jakub Kicinski wrote: > On Sat, 27 Nov 2021 19:45:26 +0200 Ido Schimmel wrote: > > This patchset extends the ethtool netlink API to allow user space to > > both flash transceiver modules' firmware and query the firmware > > information (e.g., version, state). > > > > The main use case is CMIS compliant modules such as QSFP-DD. The CMIS > > standard specifies the interfaces used for both operations. See section > > 7.3.1 in revision 5.0 of the standard [1]. > > > > Despite the immediate use case being CMIS compliant modules, the user > > interface is kept generic enough to accommodate future use cases, if > > these arise. > > > > The purpose of this RFC is to solicit feedback on both the proposed user > > interface and the device driver API which are described in detail in > > patches #1 and #3. The netdevsim patches are for RFC purposes only. The > > plan is to implement the CMIS functionality in common code (under lib/) > > so that it can be shared by MAC drivers that will pass function pointers > > to it in order to read and write from their modules EEPROM. > > > > ethtool(8) patches can be found here [2]. > > Immediate question I have is why not devlink. We purposefully moved > FW flashing to devlink because I may take long, so doing it under > rtnl_lock is really bad. Other advantages exist (like flashing > non-Ethernet ports). Ethtool netlink already existed at the time. Note that ethtool (as userspace utility) can still provide the functionality even if it's implemented in devlink API; this is likely going to be the case for device EEPROM flash (ethtool -f). At the moment, there is a problem that not nearly every device capable of flashing implements devlink but that could be addressed by the "generic devlink" idea (a wrapper implementing selected parts of devlink API for devices without an actual devlink implementation). Michal
On Mon, Nov 29, 2021 at 09:37:24AM -0800, Jakub Kicinski wrote: > On Sat, 27 Nov 2021 19:45:26 +0200 Ido Schimmel wrote: > > This patchset extends the ethtool netlink API to allow user space to > > both flash transceiver modules' firmware and query the firmware > > information (e.g., version, state). > > > > The main use case is CMIS compliant modules such as QSFP-DD. The CMIS > > standard specifies the interfaces used for both operations. See section > > 7.3.1 in revision 5.0 of the standard [1]. > > > > Despite the immediate use case being CMIS compliant modules, the user > > interface is kept generic enough to accommodate future use cases, if > > these arise. > > > > The purpose of this RFC is to solicit feedback on both the proposed user > > interface and the device driver API which are described in detail in > > patches #1 and #3. The netdevsim patches are for RFC purposes only. The > > plan is to implement the CMIS functionality in common code (under lib/) > > so that it can be shared by MAC drivers that will pass function pointers > > to it in order to read and write from their modules EEPROM. > > > > ethtool(8) patches can be found here [2]. > > Immediate question I have is why not devlink. We purposefully moved > FW flashing to devlink because I may take long, so doing it under > rtnl_lock is really bad. Other advantages exist (like flashing > non-Ethernet ports). Ethtool netlink already existed at the time. > > I think device flashing may also benefit from the infra you're adding. The idea of asynchronous operations without holding RTNL is not that new. The cable test code does it, but clearly cable testing is likely network specific, unlike FW flashing. Andrew
> -----Original Message----- > From: Jakub Kicinski <kuba@kernel.org> > Sent: Monday, November 29, 2021 9:37 AM > To: Ido Schimmel <idosch@idosch.org> > Cc: netdev@vger.kernel.org; davem@davemloft.net; andrew@lunn.ch; > mkubecek@suse.cz; pali@kernel.org; Keller, Jacob E <jacob.e.keller@intel.com>; > vadimp@nvidia.com; mlxsw@nvidia.com; Ido Schimmel <idosch@nvidia.com> > Subject: Re: [RFC PATCH net-next 0/4] ethtool: Add ability to flash and query > transceiver modules' firmware > > On Sat, 27 Nov 2021 19:45:26 +0200 Ido Schimmel wrote: > > This patchset extends the ethtool netlink API to allow user space to > > both flash transceiver modules' firmware and query the firmware > > information (e.g., version, state). > > > > The main use case is CMIS compliant modules such as QSFP-DD. The CMIS > > standard specifies the interfaces used for both operations. See section > > 7.3.1 in revision 5.0 of the standard [1]. > > > > Despite the immediate use case being CMIS compliant modules, the user > > interface is kept generic enough to accommodate future use cases, if > > these arise. > > > > The purpose of this RFC is to solicit feedback on both the proposed user > > interface and the device driver API which are described in detail in > > patches #1 and #3. The netdevsim patches are for RFC purposes only. The > > plan is to implement the CMIS functionality in common code (under lib/) > > so that it can be shared by MAC drivers that will pass function pointers > > to it in order to read and write from their modules EEPROM. > > > > ethtool(8) patches can be found here [2]. > > Immediate question I have is why not devlink. We purposefully moved > FW flashing to devlink because I may take long, so doing it under > rtnl_lock is really bad. Other advantages exist (like flashing > non-Ethernet ports). Ethtool netlink already existed at the time. > > I think device flashing may also benefit from the infra you're adding. I also immediately thought of devlink here. Thanks, Jake
On Tue, 30 Nov 2021 00:50:41 +0100 Andrew Lunn wrote: > On Mon, Nov 29, 2021 at 09:37:24AM -0800, Jakub Kicinski wrote: > > On Sat, 27 Nov 2021 19:45:26 +0200 Ido Schimmel wrote: > > > This patchset extends the ethtool netlink API to allow user space to > > > both flash transceiver modules' firmware and query the firmware > > > information (e.g., version, state). > > > > > > The main use case is CMIS compliant modules such as QSFP-DD. The CMIS > > > standard specifies the interfaces used for both operations. See section > > > 7.3.1 in revision 5.0 of the standard [1]. > > > > > > Despite the immediate use case being CMIS compliant modules, the user > > > interface is kept generic enough to accommodate future use cases, if > > > these arise. > > > > > > The purpose of this RFC is to solicit feedback on both the proposed user > > > interface and the device driver API which are described in detail in > > > patches #1 and #3. The netdevsim patches are for RFC purposes only. The > > > plan is to implement the CMIS functionality in common code (under lib/) > > > so that it can be shared by MAC drivers that will pass function pointers > > > to it in order to read and write from their modules EEPROM. > > > > > > ethtool(8) patches can be found here [2]. > > > > Immediate question I have is why not devlink. We purposefully moved > > FW flashing to devlink because I may take long, so doing it under > > rtnl_lock is really bad. Other advantages exist (like flashing > > non-Ethernet ports). Ethtool netlink already existed at the time. > > > > I think device flashing may also benefit from the infra you're adding. > > The idea of asynchronous operations without holding RTNL is not that > new. The cable test code does it, but clearly cable testing is likely > network specific, unlike FW flashing. Right, I missed this is async. Presumably since there is a plan for a common module the chance of bugs sneaking in will be somewhat lower, but still flashing FW is flashing FW, would be great if we could align with device FW flashing as done via devlink.
On Mon, Nov 29, 2021 at 09:37:24AM -0800, Jakub Kicinski wrote: > On Sat, 27 Nov 2021 19:45:26 +0200 Ido Schimmel wrote: > > This patchset extends the ethtool netlink API to allow user space to > > both flash transceiver modules' firmware and query the firmware > > information (e.g., version, state). > > > > The main use case is CMIS compliant modules such as QSFP-DD. The CMIS > > standard specifies the interfaces used for both operations. See section > > 7.3.1 in revision 5.0 of the standard [1]. > > > > Despite the immediate use case being CMIS compliant modules, the user > > interface is kept generic enough to accommodate future use cases, if > > these arise. > > > > The purpose of this RFC is to solicit feedback on both the proposed user > > interface and the device driver API which are described in detail in > > patches #1 and #3. The netdevsim patches are for RFC purposes only. The > > plan is to implement the CMIS functionality in common code (under lib/) > > so that it can be shared by MAC drivers that will pass function pointers > > to it in order to read and write from their modules EEPROM. > > > > ethtool(8) patches can be found here [2]. > > Immediate question I have is why not devlink. We purposefully moved > FW flashing to devlink because I may take long, so doing it under > rtnl_lock is really bad. Other advantages exist (like flashing > non-Ethernet ports). Ethtool netlink already existed at the time. Device firmware flashing doesn't belong in devlink because of locking semantics. It belongs in devlink because you are updating the firmware of the device, which can instantiate multiple netdevs. For multi-port devices, it always seemed weird to tell users "choose some random port and use it for 'ethtool -f'". I remember being asked if the command needs to be run for all swp* netdevs (no). On the other hand, each netdev corresponds to a single transceiver module and each transceiver module corresponds to a single netdev (modulo split which is a user configuration). In addition, users are already dumping the EEPROM contents of transceiver modules via ethtool and also toggling their settings. Given the above, it's beyond me why we should tell users to use anything other than ethtool to update transceiver modules' firmware. Also, note that an important difference from the devlink implementation is that this mechanism is asynchronous, which allows the firmware on multiple modules to be updated simultaneously.
On Tue, Nov 30, 2021 at 10:36:02AM +0200, Ido Schimmel wrote: > On Mon, Nov 29, 2021 at 09:37:24AM -0800, Jakub Kicinski wrote: > > > > Immediate question I have is why not devlink. We purposefully moved > > FW flashing to devlink because I may take long, so doing it under > > rtnl_lock is really bad. Other advantages exist (like flashing > > non-Ethernet ports). Ethtool netlink already existed at the time. > > Device firmware flashing doesn't belong in devlink because of locking > semantics. It belongs in devlink because you are updating the firmware > of the device, which can instantiate multiple netdevs. For multi-port > devices, it always seemed weird to tell users "choose some random port > and use it for 'ethtool -f'". I remember being asked if the command > needs to be run for all swp* netdevs (no). > > On the other hand, each netdev corresponds to a single transceiver > module and each transceiver module corresponds to a single netdev > (modulo split which is a user configuration). Devlink also has abstraction for ports so it can be used so it is not necessarily a problem. > In addition, users are already dumping the EEPROM contents of > transceiver modules via ethtool and also toggling their settings. > > Given the above, it's beyond me why we should tell users to use anything > other than ethtool to update transceiver modules' firmware. As I already mentioned, we should distinguish between ethtool API and ethtool utility. It is possible to implement the flashing in devlink API and let both devlink and ethtool utilities use that API. I'm not saying ethtool API is a wrong choice, IMHO either option has its pros and cons. I'm just trying to point out that implementation in devlink API does not necessarily mean one cannot use the ethtool to use the feature. Michal
On Tue, Nov 30, 2021 at 09:54:26AM +0100, Michal Kubecek wrote: > On Tue, Nov 30, 2021 at 10:36:02AM +0200, Ido Schimmel wrote: > > On Mon, Nov 29, 2021 at 09:37:24AM -0800, Jakub Kicinski wrote: > > > > > > Immediate question I have is why not devlink. We purposefully moved > > > FW flashing to devlink because I may take long, so doing it under > > > rtnl_lock is really bad. Other advantages exist (like flashing > > > non-Ethernet ports). Ethtool netlink already existed at the time. > > > > Device firmware flashing doesn't belong in devlink because of locking > > semantics. It belongs in devlink because you are updating the firmware > > of the device, which can instantiate multiple netdevs. For multi-port > > devices, it always seemed weird to tell users "choose some random port > > and use it for 'ethtool -f'". I remember being asked if the command > > needs to be run for all swp* netdevs (no). > > > > On the other hand, each netdev corresponds to a single transceiver > > module and each transceiver module corresponds to a single netdev > > (modulo split which is a user configuration). > > Devlink also has abstraction for ports so it can be used so it is not > necessarily a problem. > > > In addition, users are already dumping the EEPROM contents of > > transceiver modules via ethtool and also toggling their settings. > > > > Given the above, it's beyond me why we should tell users to use anything > > other than ethtool to update transceiver modules' firmware. > > As I already mentioned, we should distinguish between ethtool API and > ethtool utility. It is possible to implement the flashing in devlink API > and let both devlink and ethtool utilities use that API. > > I'm not saying ethtool API is a wrong choice, IMHO either option has its > pros and cons. What are the cons of implementing it in ethtool? It seems that the only thing devlink has going for it is the fact that it supports devlink device firmware update API, but it cannot be used as-is and needs to be heavily extended (e.g., asynchronicity is a must, per-port as opposed to per-device). It doesn't support any transceiver module API, as opposed to ethtool. > I'm just trying to point out that implementation in devlink API does > not necessarily mean one cannot use the ethtool to use the feature. I agree it can be done, but the fact that something can be done doesn't mean it should be done. If I'm extending devlink with new uAPI, then I will add support for it in devlink(8) and not ethtool(8) and vice versa.
On Tue, 30 Nov 2021 11:46:48 +0200 Ido Schimmel wrote: > > As I already mentioned, we should distinguish between ethtool API and > > ethtool utility. It is possible to implement the flashing in devlink API > > and let both devlink and ethtool utilities use that API. > > > > I'm not saying ethtool API is a wrong choice, IMHO either option has its > > pros and cons. > > What are the cons of implementing it in ethtool? It seems that the only > thing devlink has going for it is the fact that it supports devlink > device firmware update API, but it cannot be used as-is and needs to be > heavily extended (e.g., asynchronicity is a must, per-port as opposed to > per-device). It doesn't support any transceiver module API, as opposed > to ethtool. The primary advantage is that we could hopefully share some of the infrastructure around versioning, A/B image selection, activation and error reporting. All those are universal firmware update problems. > > I'm just trying to point out that implementation in devlink API does > > not necessarily mean one cannot use the ethtool to use the feature. > > I agree it can be done, but the fact that something can be done doesn't > mean it should be done. If I'm extending devlink with new uAPI, then I > will add support for it in devlink(8) and not ethtool(8) and vice versa. I'm not dead set on SFP flashing being in devlink, I just think it's the right choice, but at the end of the day - your call. From my experience working with and on FW management in production (using devlink) I don't think that the "rest of the SFP API is in ethtool" motivation matters in practice. At least not in my environment. Upgrading firmware is a process that's more concerned with different device components than the functionality those devices actually provide. For a person writing FW update automation its better if they have one type of API to talk to. IOW nobody cares if e.g. the FW upgrade on a soundcard is via the sound API. When automation gets more complex (again versioning, checking if there is degradation and FW has to be re-applied, checking if upgrades can be live, or device has to be reset, power cycled, etc) plugging into a consistent API is what matters most.
From: Ido Schimmel <idosch@nvidia.com> This patchset extends the ethtool netlink API to allow user space to both flash transceiver modules' firmware and query the firmware information (e.g., version, state). The main use case is CMIS compliant modules such as QSFP-DD. The CMIS standard specifies the interfaces used for both operations. See section 7.3.1 in revision 5.0 of the standard [1]. Despite the immediate use case being CMIS compliant modules, the user interface is kept generic enough to accommodate future use cases, if these arise. The purpose of this RFC is to solicit feedback on both the proposed user interface and the device driver API which are described in detail in patches #1 and #3. The netdevsim patches are for RFC purposes only. The plan is to implement the CMIS functionality in common code (under lib/) so that it can be shared by MAC drivers that will pass function pointers to it in order to read and write from their modules EEPROM. ethtool(8) patches can be found here [2]. [1] http://www.qsfp-dd.com/wp-content/uploads/2021/05/CMIS5p0.pdf [2] https://github.com/idosch/ethtool/tree/submit/module_fw_rfc Ido Schimmel (4): ethtool: Add ability to query transceiver modules' firmware information netdevsim: Implement support for ethtool_ops::get_module_fw_info ethtool: Add ability to flash transceiver modules' firmware netdevsim: Implement support for ethtool_ops::start_fw_flash_module Documentation/networking/ethtool-netlink.rst | 128 +++++++- drivers/net/netdevsim/ethtool.c | 164 ++++++++++ drivers/net/netdevsim/netdevsim.h | 10 + include/linux/ethtool.h | 109 +++++++ include/linux/ethtool_netlink.h | 9 + include/uapi/linux/ethtool.h | 18 + include/uapi/linux/ethtool_netlink.h | 49 +++ net/ethtool/module.c | 327 +++++++++++++++++++ net/ethtool/netlink.c | 17 + net/ethtool/netlink.h | 4 + 10 files changed, 833 insertions(+), 2 deletions(-)