Message ID | 20220425034431.3161260-1-idosch@nvidia.com (mailing list archive) |
---|---|
Headers | show |
Series | mlxsw: extend line card model by devices and info | expand |
Hello: This series was applied to netdev/net-next.git (master) by David S. Miller <davem@davemloft.net>: On Mon, 25 Apr 2022 06:44:20 +0300 you wrote: > Jiri says: > > This patchset is extending the line card model by three items: > 1) line card devices > 2) line card info > 3) line card device info > > [...] Here is the summary with links: - [net-next,01/11] devlink: introduce line card devices support https://git.kernel.org/netdev/net-next/c/8d92e4fbcf0f - [net-next,02/11] devlink: introduce line card info get message https://git.kernel.org/netdev/net-next/c/276910aecc6a - [net-next,03/11] devlink: introduce line card device info infrastructure https://git.kernel.org/netdev/net-next/c/28b2d1f1ac41 - [net-next,04/11] mlxsw: reg: Extend MDDQ by device_info https://git.kernel.org/netdev/net-next/c/798e2df5067c - [net-next,05/11] mlxsw: core_linecards: Probe provisioned line cards for devices and attach them https://git.kernel.org/netdev/net-next/c/8e2e10f65112 - [net-next,06/11] selftests: mlxsw: Check devices on provisioned line card https://git.kernel.org/netdev/net-next/c/5e2229891825 - [net-next,07/11] mlxsw: core_linecards: Expose HW revision and INI version https://git.kernel.org/netdev/net-next/c/3b37130f4855 - [net-next,08/11] selftests: mlxsw: Check line card info on provisioned line card https://git.kernel.org/netdev/net-next/c/08682c9e58cd - [net-next,09/11] mlxsw: reg: Extend MDDQ device_info by FW version fields https://git.kernel.org/netdev/net-next/c/c38e9bf33812 - [net-next,10/11] mlxsw: core_linecards: Expose device FW version over device info https://git.kernel.org/netdev/net-next/c/e932b4bdbd7c - [net-next,11/11] selftests: mlxsw: Check device info on activated line card https://git.kernel.org/netdev/net-next/c/002defd576a3 You are awesome, thank you!
On Mon, 25 Apr 2022 06:44:20 +0300 Ido Schimmel wrote: > This patchset is extending the line card model by three items: > 1) line card devices > 2) line card info > 3) line card device info > > First three patches are introducing the necessary changes in devlink > core. > > Then, all three extensions are implemented in mlxsw alongside with > selftest. :/ what is a line card device? You must provide document what you're doing, this: .../networking/devlink/devlink-linecard.rst | 4 + is not enough. How many operations and attributes are you going to copy&paste? Is linking devlink instances into a hierarchy a better approach? Would you mind if I revert this? I don't understand why the line card patches are applied in 6h on Sunday night, especially that RFCv1 had quite a long discussion. But really any uAPI additions should warrant longer review time, IMHO.
On Mon, Apr 25, 2022 at 09:00:21AM -0700, Jakub Kicinski wrote: > On Mon, 25 Apr 2022 06:44:20 +0300 Ido Schimmel wrote: > > This patchset is extending the line card model by three items: > > 1) line card devices > > 2) line card info > > 3) line card device info > > > > First three patches are introducing the necessary changes in devlink > > core. > > > > Then, all three extensions are implemented in mlxsw alongside with > > selftest. > > :/ what is a line card device? You must provide document what you're > doing, this: > > .../networking/devlink/devlink-linecard.rst | 4 + > > is not enough. > > How many operations and attributes are you going to copy&paste? > > Is linking devlink instances into a hierarchy a better approach? In this particular case, these devices are gearboxes. They are running their own firmware and we want user space to be able to query and update the running firmware version. The idea (implemented in the next patchset) is to let these devices expose their own "component name", which can then be plugged into the existing flash command: $ devlink lc show pci/0000:01:00.0 lc 8 pci/0000:01:00.0: lc 8 state active type 16x100G supported_types: 16x100G devices: device 0 flashable true component lc8_dev0 device 1 flashable false device 2 flashable false device 3 flashable false $ devlink dev flash pci/0000:01:00.0 file some_file.mfa2 component lc8_dev0 Registering a separate devlink instance for these devices sounds like an overkill to me. If you are not OK with a separate command (e.g., DEVLINK_CMD_LINECARD_INFO_GET), then extending DEVLINK_CMD_INFO_GET is also an option. We discussed this during internal review, but felt that the current approach is cleaner. > Would you mind if I revert this? I can't stop you, but keep in mind that it's already late here and that tomorrow I'm AFK (reserve duty) and won't be able to tag it. Jiri should be available to continue this discussion tomorrow morning, so probably best to wait for his feedback.
On Mon, 25 Apr 2022 22:39:57 +0300 Ido Schimmel wrote: > > :/ what is a line card device? You must provide document what you're > > doing, this: > > > > .../networking/devlink/devlink-linecard.rst | 4 + > > > > is not enough. > > > > How many operations and attributes are you going to copy&paste? > > > > Is linking devlink instances into a hierarchy a better approach? > > In this particular case, these devices are gearboxes. They are running > their own firmware and we want user space to be able to query and update > the running firmware version. Nothing too special, then, we don't create "devices" for every component of the system which can have a separate FW. That's where "components" are intended to be used.. > The idea (implemented in the next patchset) is to let these devices > expose their own "component name", which can then be plugged into the > existing flash command: > > $ devlink lc show pci/0000:01:00.0 lc 8 > pci/0000:01:00.0: > lc 8 state active type 16x100G > supported_types: > 16x100G > devices: > device 0 flashable true component lc8_dev0 > device 1 flashable false > device 2 flashable false > device 3 flashable false > $ devlink dev flash pci/0000:01:00.0 file some_file.mfa2 component lc8_dev0 IDK if it's just me or this assumes deep knowledge of the system. I don't understand why we need to list devices 1-3 at all. And they don't even have names. No information is exposed. There are many components on any networking device, including plenty 40G-R4 -> 25G-R1 gearboxes out there. > Registering a separate devlink instance for these devices sounds like an > overkill to me. If you are not OK with a separate command (e.g., > DEVLINK_CMD_LINECARD_INFO_GET), then extending DEVLINK_CMD_INFO_GET is > also an option. We discussed this during internal review, but felt that > the current approach is cleaner. I don't know what you have queued, so if you don't need a full devlink instance (IOW line cards won't need more individual config) that's fine. For just FW flashing you can list many devices and update the components... no need to introduce new objects or uAPI. > > Would you mind if I revert this? > > I can't stop you, but keep in mind that it's already late here and that > tomorrow I'm AFK (reserve duty) and won't be able to tag it. Jiri should > be available to continue this discussion tomorrow morning, so probably > best to wait for his feedback. Sure, no rush.
Mon, Apr 25, 2022 at 09:39:57PM CEST, idosch@idosch.org wrote: >On Mon, Apr 25, 2022 at 09:00:21AM -0700, Jakub Kicinski wrote: >> On Mon, 25 Apr 2022 06:44:20 +0300 Ido Schimmel wrote: >> > This patchset is extending the line card model by three items: >> > 1) line card devices >> > 2) line card info >> > 3) line card device info >> > >> > First three patches are introducing the necessary changes in devlink >> > core. >> > >> > Then, all three extensions are implemented in mlxsw alongside with >> > selftest. >> >> :/ what is a line card device? You must provide document what you're >> doing, this: >> >> .../networking/devlink/devlink-linecard.rst | 4 + >> >> is not enough. >> >> How many operations and attributes are you going to copy&paste? >> >> Is linking devlink instances into a hierarchy a better approach? > >In this particular case, these devices are gearboxes. They are running >their own firmware and we want user space to be able to query and update >the running firmware version. > >The idea (implemented in the next patchset) is to let these devices >expose their own "component name", which can then be plugged into the >existing flash command: > > $ devlink lc show pci/0000:01:00.0 lc 8 > pci/0000:01:00.0: > lc 8 state active type 16x100G > supported_types: > 16x100G > devices: > device 0 flashable true component lc8_dev0 > device 1 flashable false > device 2 flashable false > device 3 flashable false > $ devlink dev flash pci/0000:01:00.0 file some_file.mfa2 component lc8_dev0 > >Registering a separate devlink instance for these devices sounds like an It is not a separate devlink device, not removetely. A devlink device is attached to some bus on the host, it contains entities like netdevices, etc. Line card devices, on contrary, are accessible over ASIC FW interface, they reside on line cards. ASIC FW is using build-in SDK to communicate with them. There is really nothing to expose, except for the face they are there, with some FW version and later on (follow-up patchset) to be able to flash FW on them. It's a gearbox. I found it odd to name it gearbox as in theory, there might be some other single purpose device on the line card of other type than gearbox. Therefore, "device". I admit it is a bit misleading. Idea for a better name? >overkill to me. If you are not OK with a separate command (e.g., >DEVLINK_CMD_LINECARD_INFO_GET), then extending DEVLINK_CMD_INFO_GET is >also an option. We discussed this during internal review, but felt that We would need to add all the line card hierarchy into info_get. That would look a bit odd to me. >the current approach is cleaner. > >> Would you mind if I revert this? Let's see what you need to change and change it in place, if possible. > >I can't stop you, but keep in mind that it's already late here and that >tomorrow I'm AFK (reserve duty) and won't be able to tag it. Jiri should >be available to continue this discussion tomorrow morning, so probably >best to wait for his feedback.
Mon, Apr 25, 2022 at 09:52:18PM CEST, kuba@kernel.org wrote: >On Mon, 25 Apr 2022 22:39:57 +0300 Ido Schimmel wrote: >> > :/ what is a line card device? You must provide document what you're >> > doing, this: >> > >> > .../networking/devlink/devlink-linecard.rst | 4 + >> > >> > is not enough. >> > >> > How many operations and attributes are you going to copy&paste? >> > >> > Is linking devlink instances into a hierarchy a better approach? >> >> In this particular case, these devices are gearboxes. They are running >> their own firmware and we want user space to be able to query and update >> the running firmware version. > >Nothing too special, then, we don't create "devices" for every >component of the system which can have a separate FW. That's where >"components" are intended to be used.. * Sure, that is why I re-used components :) But you have to somehow link the component to the particular gearbox on particular line card. Say, you need to flash GB on line card 8. This is basically providing a way to expose this relationship to user. Also, the "lc info" shows the FW version for gearboxes. As Ido mentioned, the GB versions could be listed in "devlink dev info" in theory. But then, you need to somehow expose the relationship with line card as well. I don't see any simpler iface than this. > >> The idea (implemented in the next patchset) is to let these devices >> expose their own "component name", which can then be plugged into the >> existing flash command: >> >> $ devlink lc show pci/0000:01:00.0 lc 8 >> pci/0000:01:00.0: >> lc 8 state active type 16x100G >> supported_types: >> 16x100G >> devices: >> device 0 flashable true component lc8_dev0 >> device 1 flashable false >> device 2 flashable false >> device 3 flashable false >> $ devlink dev flash pci/0000:01:00.0 file some_file.mfa2 component lc8_dev0 > >IDK if it's just me or this assumes deep knowledge of the system. >I don't understand why we need to list devices 1-3 at all. And they >don't even have names. No information is exposed. There are 4 gearboxes on the line card. They share the same flash. So if you flash gearbox 0, the rest will use the same FW. I'm exposing them for the sake of completeness. Also, the interface needs to be designed as a list anyway, as different line cards may have separate flash per gearbox. What's is the harm in exposing devices 1-3? If you insist, we can hide them. > >There are many components on any networking device, including plenty >40G-R4 -> 25G-R1 gearboxes out there. > >> Registering a separate devlink instance for these devices sounds like an >> overkill to me. If you are not OK with a separate command (e.g., >> DEVLINK_CMD_LINECARD_INFO_GET), then extending DEVLINK_CMD_INFO_GET is >> also an option. We discussed this during internal review, but felt that >> the current approach is cleaner. > >I don't know what you have queued, so if you don't need a full devlink >instance (IOW line cards won't need more individual config) that's fine. Yeah, incoparable, the devlink dev and line card device - gearbox. >For just FW flashing you can list many devices and update the >components... no need to introduce new objects or uAPI. Please see * above. > >> > Would you mind if I revert this? >> >> I can't stop you, but keep in mind that it's already late here and that >> tomorrow I'm AFK (reserve duty) and won't be able to tag it. Jiri should >> be available to continue this discussion tomorrow morning, so probably >> best to wait for his feedback. > >Sure, no rush.
> >> The idea (implemented in the next patchset) is to let these devices > >> expose their own "component name", which can then be plugged into the > >> existing flash command: > >> > >> $ devlink lc show pci/0000:01:00.0 lc 8 > >> pci/0000:01:00.0: > >> lc 8 state active type 16x100G > >> supported_types: > >> 16x100G > >> devices: > >> device 0 flashable true component lc8_dev0 > >> device 1 flashable false > >> device 2 flashable false > >> device 3 flashable false > >> $ devlink dev flash pci/0000:01:00.0 file some_file.mfa2 component lc8_dev0 > > > >IDK if it's just me or this assumes deep knowledge of the system. > >I don't understand why we need to list devices 1-3 at all. And they > >don't even have names. No information is exposed. > > There are 4 gearboxes on the line card. They share the same flash. So if > you flash gearbox 0, the rest will use the same FW. Is the flash big enough to hold four copies of the firmware? Listing all four devices gives you a path forward to allowing each device to have its own firmware. On the flip side, flashable false is not really true. I don't know this API at all, but are there any flags? Can you at least indicate it is shared? You could then have an output something like: device 0 flashable true shared component lc8_dev0 device 1 flashable true shared component lc8_dev0 device 2 flashable true shared component lc8_dev0 device 3 flashable true shared component lc8_dev0 so you get to see the sharing relationship. Andrew
> It is not a separate devlink device, not removetely. A devlink device is > attached to some bus on the host, it contains entities like netdevices, > etc. > > Line card devices, on contrary, are accessible over ASIC FW interface, > they reside on line cards. ASIC FW is using build-in SDK to communicate > with them. There is really nothing to expose, except for the face they > are there, with some FW version and later on (follow-up patchset) to be > able to flash FW on them. But isn't this just an implementation detail? Say the flash was directly accessible to the host? It is just another mtd devices? The gearbox is just another bunch of MMIO registers. You can access the SFP socket via a host i2c bus, etc. More of a SoC like implementation, which the enterprise routers are like. This is a completely different set of implementation details, but i still have the same basic building blocks. Should it look the same, and the implementation details are hidden, or do you want to expose your implementation details? Andrew
Tue, Apr 26, 2022 at 02:11:43PM CEST, andrew@lunn.ch wrote: >> >> The idea (implemented in the next patchset) is to let these devices >> >> expose their own "component name", which can then be plugged into the >> >> existing flash command: >> >> >> >> $ devlink lc show pci/0000:01:00.0 lc 8 >> >> pci/0000:01:00.0: >> >> lc 8 state active type 16x100G >> >> supported_types: >> >> 16x100G >> >> devices: >> >> device 0 flashable true component lc8_dev0 >> >> device 1 flashable false >> >> device 2 flashable false >> >> device 3 flashable false >> >> $ devlink dev flash pci/0000:01:00.0 file some_file.mfa2 component lc8_dev0 >> > >> >IDK if it's just me or this assumes deep knowledge of the system. >> >I don't understand why we need to list devices 1-3 at all. And they >> >don't even have names. No information is exposed. >> >> There are 4 gearboxes on the line card. They share the same flash. So if >> you flash gearbox 0, the rest will use the same FW. > >Is the flash big enough to hold four copies of the firmware? Listing One copy, all 4 are using it. >all four devices gives you a path forward to allowing each device to >have its own firmware. Yes. > >On the flip side, flashable false is not really true. I don't know >this API at all, but are there any flags? Can you at least indicate it >is shared? You could then have an output something like: Yes, that I can do. > > device 0 flashable true shared component lc8_dev0 > device 1 flashable true shared component lc8_dev0 > device 2 flashable true shared component lc8_dev0 > device 3 flashable true shared component lc8_dev0 > >so you get to see the sharing relationship. Agreed. > > Andrew
Tue, Apr 26, 2022 at 02:27:54PM CEST, andrew@lunn.ch wrote: >> It is not a separate devlink device, not removetely. A devlink device is >> attached to some bus on the host, it contains entities like netdevices, >> etc. >> >> Line card devices, on contrary, are accessible over ASIC FW interface, >> they reside on line cards. ASIC FW is using build-in SDK to communicate >> with them. There is really nothing to expose, except for the face they >> are there, with some FW version and later on (follow-up patchset) to be >> able to flash FW on them. > >But isn't this just an implementation detail? > >Say the flash was directly accessible to the host? It is just another >mtd devices? The gearbox is just another bunch of MMIO registers. You Not really, the ASIC FW is also not mtd device. ASIC FW exposes set of registers to work with the ASIC FW flash. The linecard gearbox implements the same set of registers, tunneled over MDDT register. Also, the gearbox is not accessable from the host. The ASIC FW is the only one able to access it. >can access the SFP socket via a host i2c bus, etc. More of a SoC like >implementation, which the enterprise routers are like. > >This is a completely different set of implementation details, but i >still have the same basic building blocks. Should it look the same, >and the implementation details are hidden, or do you want to expose >your implementation details? Well, I got your point. If the HW would be designed in the way the building blocks are exposed to the host, that would work. However, that is not the case here, unfortunatelly. > > Andrew
On Tue, 26 Apr 2022 08:57:15 +0200 Jiri Pirko wrote: > >> In this particular case, these devices are gearboxes. They are running > >> their own firmware and we want user space to be able to query and update > >> the running firmware version. > > > >Nothing too special, then, we don't create "devices" for every > >component of the system which can have a separate FW. That's where > >"components" are intended to be used.. > > * > Sure, that is why I re-used components :) Well, right, I guess you did reuse them a little :) > But you have to somehow link the component to the particular gearbox on > particular line card. Say, you need to flash GB on line card 8. This is > basically providing a way to expose this relationship to user. > > Also, the "lc info" shows the FW version for gearboxes. As Ido > mentioned, the GB versions could be listed in "devlink dev info" in > theory. But then, you need to somehow expose the relationship with > line card as well. Why would the automation which comes to update the firmware care at all where the component is? Humans can see what the component is by looking at the name. If we do need to know (*if*!) you can list FW components as a lc attribute, no need for new commands and objects. IMHO we should either keep lc objects simple and self contained or give them a devlink instance. Creating sub-objects from them is very worrying. If there is _any_ chance we'll need per-lc health reporters or sbs or params(
> Well, I got your point. If the HW would be designed in the way the > building blocks are exposed to the host, that would work. However, that > is not the case here, unfortunatelly. I'm with Jakub. It is the uAPI which matters here. It should look the same for a SoC style enterprise router and your discombobulated TOR router. How you talk to the different building blocks is an implementation detail. Andrew
Tue, Apr 26, 2022 at 02:41:30PM CEST, kuba@kernel.org wrote: >On Tue, 26 Apr 2022 08:57:15 +0200 Jiri Pirko wrote: >> >> In this particular case, these devices are gearboxes. They are running >> >> their own firmware and we want user space to be able to query and update >> >> the running firmware version. >> > >> >Nothing too special, then, we don't create "devices" for every >> >component of the system which can have a separate FW. That's where >> >"components" are intended to be used.. >> >> * >> Sure, that is why I re-used components :) > >Well, right, I guess you did reuse them a little :) I use them a lot. It is not visible in this patchset, but in the flashing follow-up patchset. > >> But you have to somehow link the component to the particular gearbox on >> particular line card. Say, you need to flash GB on line card 8. This is >> basically providing a way to expose this relationship to user. >> >> Also, the "lc info" shows the FW version for gearboxes. As Ido >> mentioned, the GB versions could be listed in "devlink dev info" in >> theory. But then, you need to somehow expose the relationship with >> line card as well. > >Why would the automation which comes to update the firmware care >at all where the component is? Humans can see what the component >is by looking at the name. The relationship-by-name sounds a bit fragile to me. The names of components are up to the individual drivers. > >If we do need to know (*if*!) you can list FW components as a lc >attribute, no need for new commands and objects. There is no new command for that, only one nested attribute which carries the device list added to the existing command. They are no new objects, they are just few nested values. > >IMHO we should either keep lc objects simple and self contained or >give them a devlink instance. Creating sub-objects from them is very Give them a devlink instance? I don't understand how. LC is not a separate device, far from that. That does not make any sense to me. >worrying. If there is _any_ chance we'll need per-lc health reporters >or sbs or params(
Tue, Apr 26, 2022 at 03:45:48PM CEST, andrew@lunn.ch wrote: >> Well, I got your point. If the HW would be designed in the way the >> building blocks are exposed to the host, that would work. However, that >> is not the case here, unfortunatelly. > >I'm with Jakub. It is the uAPI which matters here. It should look the >same for a SoC style enterprise router and your discombobulated TOR >router. How you talk to the different building blocks is an >implementation detail. It's not that simple. Take the gearbox for example. You say bunch of MDIO registers. ASIC FW has a custom SDK internally that is used to talk to the gearbox. The flash, you say expose by MTD, but there is no access to it directly from host. Can't be done. There are HW design limitations that are blocking your concept.
On Tue, 26 Apr 2022 16:00:10 +0200 Jiri Pirko wrote: > >> But you have to somehow link the component to the particular gearbox on > >> particular line card. Say, you need to flash GB on line card 8. This is > >> basically providing a way to expose this relationship to user. > >> > >> Also, the "lc info" shows the FW version for gearboxes. As Ido > >> mentioned, the GB versions could be listed in "devlink dev info" in > >> theory. But then, you need to somehow expose the relationship with > >> line card as well. > > > >Why would the automation which comes to update the firmware care > >at all where the component is? Humans can see what the component > >is by looking at the name. > > The relationship-by-name sounds a bit fragile to me. The names of > components are up to the individual drivers. I asked you how the automation will operate. You must answer questions if you want to have a discussion. Automation is the relevant part. You're not designing an interface for SDK users but for end users. > >If we do need to know (*if*!) you can list FW components as a lc > >attribute, no need for new commands and objects. > > There is no new command for that, only one nested attribute which > carries the device list added to the existing command. They are no new > objects, they are just few nested values. DEVLINK_CMD_LINECARD_INFO_GET > >IMHO we should either keep lc objects simple and self contained or > >give them a devlink instance. Creating sub-objects from them is very > > Give them a devlink instance? I don't understand how. LC is not a > separate device, far from that. That does not make any sense to me. You can put a name of another devlink instance as an attribute of a lc. See below. > >worrying. If there is _any_ chance we'll need per-lc health reporters > >or sbs or params(
> Does not make sense to me at all. Line cards are detachable PHY sets in > essence, very basic functionality. They does not have buffers, health > and params, I don't think so. Ido recently added support to ethtool to upgrade the flash in SFPs. They are far from simple devices. Some of the GPON ones have linux running in them, that you can log in to. The real question is, can you do everything you need to do via existing uAPIs like ethtool and hwmon. Andrew
On Tue, Apr 26, 2022 at 04:05:43PM +0200, Jiri Pirko wrote: > Tue, Apr 26, 2022 at 03:45:48PM CEST, andrew@lunn.ch wrote: > >> Well, I got your point. If the HW would be designed in the way the > >> building blocks are exposed to the host, that would work. However, that > >> is not the case here, unfortunatelly. > > > >I'm with Jakub. It is the uAPI which matters here. It should look the > >same for a SoC style enterprise router and your discombobulated TOR > >router. How you talk to the different building blocks is an > >implementation detail. > > It's not that simple. Take the gearbox for example. You say bunch of > MDIO registers. ASIC FW has a custom SDK internally that is used to > talk to the gearbox. > > The flash, you say expose by MTD, but there is no access to it directly > from host. Can't be done. There are HW design limitations that are > blocking your concept. The MTD API and your SDK API are abstractions. You give it a blob of data and ask it to write it to the storage. Somehow that happens. Does user space need to know MTD or an SDK is being used? Does user space care? I would expect the same uAPI for both, here is a firmware blob, write it to storage. The driver knows if it needs to use the MTD API or the SDK API, it is all abstracted away in the driver. Andrew
Tue, Apr 26, 2022 at 04:51:33PM CEST, kuba@kernel.org wrote: >On Tue, 26 Apr 2022 16:00:10 +0200 Jiri Pirko wrote: >> >> But you have to somehow link the component to the particular gearbox on >> >> particular line card. Say, you need to flash GB on line card 8. This is >> >> basically providing a way to expose this relationship to user. >> >> >> >> Also, the "lc info" shows the FW version for gearboxes. As Ido >> >> mentioned, the GB versions could be listed in "devlink dev info" in >> >> theory. But then, you need to somehow expose the relationship with >> >> line card as well. >> > >> >Why would the automation which comes to update the firmware care >> >at all where the component is? Humans can see what the component >> >is by looking at the name. >> >> The relationship-by-name sounds a bit fragile to me. The names of >> components are up to the individual drivers. > >I asked you how the automation will operate. You must answer questions >if you want to have a discussion. Automation is the relevant part. Automation, not sure. It would probably just see type of gearbox and flash it. Not sure I understand the question, perhaps you could explain? Plus, the possibility is to auto-flash the GB from driver directly. >You're not designing an interface for SDK users but for end users. Sure, that is the aim of this API. Human end user. That is why I wanted the user to see the relationships between devlink dev, line cards and the gearboxes on them. If you want to limit the visibility, sure, just tell me how. > >> >If we do need to know (*if*!) you can list FW components as a lc >> >attribute, no need for new commands and objects. >> >> There is no new command for that, only one nested attribute which >> carries the device list added to the existing command. They are no new >> objects, they are just few nested values. > >DEVLINK_CMD_LINECARD_INFO_GET Okay, that is not only to expose devices. That is also to expose info about linecards, like HW revision, INI version etc. Where else to put it? I can perhaps embed it into devlink dev info, but I thought separate command would be more suitable. object cmd, object info cmd. It is more clear I believe. > >> >IMHO we should either keep lc objects simple and self contained or >> >give them a devlink instance. Creating sub-objects from them is very >> >> Give them a devlink instance? I don't understand how. LC is not a >> separate device, far from that. That does not make any sense to me. > >You can put a name of another devlink instance as an attribute of a lc. >See below. > >> >worrying. If there is _any_ chance we'll need per-lc health reporters >> >or sbs or params(
Tue, Apr 26, 2022 at 05:24:41PM CEST, andrew@lunn.ch wrote: >> Does not make sense to me at all. Line cards are detachable PHY sets in >> essence, very basic functionality. They does not have buffers, health >> and params, I don't think so. > >Ido recently added support to ethtool to upgrade the flash in SFPs. >They are far from simple devices. Some of the GPON ones have linux >running in them, that you can log in to. > >The real question is, can you do everything you need to do via >existing uAPIs like ethtool and hwmon. No, it does not fit. Ethtool works with netdevices which are instantiated for separate port. The disconnection between what we can do with netdev as a handle and how the devices are modeled was the reason for devlink introduction in the past. > > Andrew
On Wed, 27 Apr 2022 09:35:34 +0200 Jiri Pirko wrote: > >> The relationship-by-name sounds a bit fragile to me. The names of > >> components are up to the individual drivers. > > > >I asked you how the automation will operate. You must answer questions > >if you want to have a discussion. Automation is the relevant part. > > Automation, not sure. It would probably just see type of gearbox and > flash it. Not sure I understand the question, perhaps you could explain? > Plus, the possibility is to auto-flash the GB from driver directly. > > > >You're not designing an interface for SDK users but for end users. > > Sure, that is the aim of this API. Human end user. That is why I wanted > the user to see the relationships between devlink dev, line cards and > the gearboxes on them. If you want to limit the visibility, sure, just > tell me how. Okay, we have completely different views on what the goals should be. Perhaps that explains the differences in the design. Of the three API levels (SDK, automation, human) I think automation is the only one that's interesting to us in Linux. SDK interfaces are necessarily too low level as they expose too much of internal details to standardize. Humans are good with dealing with uncertainty and diverse so there's no a good benchmark. The benchmark for automation is - can a machine use this API across different vendors to reliably achieve its goals. For FW info/flashing the goal is keeping the FW versions up to date. This is documented: https://www.kernel.org/doc/html/latest/networking/devlink/devlink-flash.html#firmware-version-management What would the pseudo code look like with "line cards" in the picture? Apply RFC1925 truth 12. > >> There is no new command for that, only one nested attribute which > >> carries the device list added to the existing command. They are no new > >> objects, they are just few nested values. > > > >DEVLINK_CMD_LINECARD_INFO_GET > > Okay, that is not only to expose devices. That is also to expose info > about linecards, like HW revision, INI version etc. Where else to put > it? I can perhaps embed it into devlink dev info, but I thought separate > command would be more suitable. object cmd, object info cmd. It is > more clear I believe. > >> If so, how does the user know if/when to flash it? > >> If not, where would you list it if devices nest is not the correct place? > > > >Let me mock up what I had in mind for you since it did not come thru > >in the explanation: > > > >$ devlink dev info show pci/0000:01:00.0 > > versions: > > fixed: > > hw.revision 0 > > lc2.hw.revision a > > lc8.hw.revision b > > running: > > ini.version 4 > > lc2.gearbox 1.1.3 > > lc8.gearbox 1.2.3 > > Would be rather: > > lc2.gearbox0 1.1.3 > lc2.gearbox1 1.2.4 I thought you said your gearboxes all the the same FW? Theoretically, yes. Theoretically, I can also have nested "line cards". > lc8.gearbox0 1.2.3 > > Okay, I see. So instead of having clear api with relationships and > clear human+machine readability we have squahed indexes into strings. > I fail to see the benefit, other than no-api-extension :/ On contrary. Show me the real life use for all the "clear api with relationships" and I'll shut up. I would not take falling back to physical (HW) hierarchy for the API design as a point of pride. Seems lazy if I'm completely honest. Someone else's HW may have a different hierarchy, and you're just forcing the automation engineer iterate over irrelevant structures ("devices"). My hunch is that automation will not want to deal with line cards separately, and flash the entire devices in one go to a tested and verified bundle blob provided by the vendor. If they do want to poke at line cards - the information is still there in what I described. > >$ devlink lc show pci/0000:01:00.0 lc 8 > >pci/0000:01:00.0: > > lc 8 state active type 16x100G > > supported_types: > > 16x100G > > versions: > > lc8.hw.revision (a) > > lc8.gearbox (1.2.3) > > > >Where the data in the brackets is optionally fetched thru the existing > >"dev info" API, but rendered together by the user space. > > Quite odd. I find it questionable to say at least to mix multiple > command netlink outputs into one output. Really? So we're going to be designing kernel APIs so that each message contains complete information and can't contain references now? > The processing of it would be a small nightmare considering the way > how the netlink message processing works in iproute2 :/
Wed, Apr 27, 2022 at 04:14:47PM CEST, kuba@kernel.org wrote: >On Wed, 27 Apr 2022 09:35:34 +0200 Jiri Pirko wrote: >> >> The relationship-by-name sounds a bit fragile to me. The names of >> >> components are up to the individual drivers. >> > >> >I asked you how the automation will operate. You must answer questions >> >if you want to have a discussion. Automation is the relevant part. >> >> Automation, not sure. It would probably just see type of gearbox and >> flash it. Not sure I understand the question, perhaps you could explain? >> Plus, the possibility is to auto-flash the GB from driver directly. >> >> >> >You're not designing an interface for SDK users but for end users. >> >> Sure, that is the aim of this API. Human end user. That is why I wanted >> the user to see the relationships between devlink dev, line cards and >> the gearboxes on them. If you want to limit the visibility, sure, just >> tell me how. > >Okay, we have completely different views on what the goals should be. >Perhaps that explains the differences in the design. I don't think so. I'm just a bit confused that you say "You're not designing an interface for SDK users but for *end users*" and now you say that primary is automation. Nevertheless, both are equally important and the iface should work for both or them, I believe. > >Of the three API levels (SDK, automation, human) I think automation >is the only one that's interesting to us in Linux. SDK interfaces are >necessarily too low level as they expose too much of internal details >to standardize. Humans are good with dealing with uncertainty and >diverse so there's no a good benchmark. > >The benchmark for automation is - can a machine use this API across >different vendors to reliably achieve its goals. For FW info/flashing >the goal is keeping the FW versions up to date. This is documented: > >https://www.kernel.org/doc/html/latest/networking/devlink/devlink-flash.html#firmware-version-management > >What would the pseudo code look like with "line cards" in the picture? >Apply RFC1925 truth 12. Something like this: $lc_count = array_size(devlink-lc-info[$handle]) for ($lcnum = 0; $lcnum < $lc_count; lcnum++): $dev_count = array_size(devlink-lc-info[$handle][$lcnum]) for ($devnum = 0; $devnum < $dev_count; $devnum++): # Get unique HW design identifier (gearbox id) $hw_id = devlink-lc-info[$handle][$lcnum][$devnum]['fw.psid'] # Find out which FW flash we want to use for this device $want_flash_vers = some-db-backed.lookup($hw_id, 'flash') # Update flash if necessary if $want_flash_vers != devlink-lc-info[$handle][$lcnum][$devnum]['fw']: $file = some-db-backed.download($hw_id, 'flash') $component = devlink-lc[$handle][$lcnum][$devnum]['component'] devlink-dev-flash($handle, $component, $file) devlink-reload() Clear indexes, not squashed somewhere in middle of string. > >> >> There is no new command for that, only one nested attribute which >> >> carries the device list added to the existing command. They are no new >> >> objects, they are just few nested values. >> > >> >DEVLINK_CMD_LINECARD_INFO_GET >> >> Okay, that is not only to expose devices. That is also to expose info >> about linecards, like HW revision, INI version etc. Where else to put >> it? I can perhaps embed it into devlink dev info, but I thought separate >> command would be more suitable. object cmd, object info cmd. It is >> more clear I believe. > >> >> If so, how does the user know if/when to flash it? >> >> If not, where would you list it if devices nest is not the correct place? >> > >> >Let me mock up what I had in mind for you since it did not come thru >> >in the explanation: >> > >> >$ devlink dev info show pci/0000:01:00.0 >> > versions: >> > fixed: >> > hw.revision 0 >> > lc2.hw.revision a >> > lc8.hw.revision b >> > running: >> > ini.version 4 >> > lc2.gearbox 1.1.3 >> > lc8.gearbox 1.2.3 >> >> Would be rather: >> >> lc2.gearbox0 1.1.3 >> lc2.gearbox1 1.2.4 > >I thought you said your gearboxes all the the same FW? >Theoretically, yes. Theoretically, I can also have nested "line cards". Well, yeah. I was under impresion that possibility of having multiple devices on the same LC is not close to 0. But I get your point. Let's try to figure out he iface as you want it: We will have devlink dev info extended to be able to provide info about the LC/gearbox. Let's work with same prefix "lcX." for all info related to line card X. First problem is, who is going to enforce this prefix. It is driver's responsibility to provide the string which is put out. The solution would be to have a helper similar to devlink_info_version_*_put() called devlink_info_lc_version_*_put() that would accept lc pointer and add the prefix. Does it make sense to you? We need 3 things: 1) current version of gearbox FW That is easy, we have it - "versions" (DEVLINK_ATTR_INFO_VERSION_* nested attrs). We can have multiple nests that would carry the versions for individiual line cards. Example: versions: fixed: hw.revision 0 lc2.hw.revision a lc8.hw.revision b running: ini.version 4 lc2.gearbox.fw.version 1.1.3 lc8.gearbox.fw.version 1.2.3 2) HW id (as you have it in your pseudocode), it is PSID in case of mlxsw. We already have PSID for ASIC: .... This should be also easy, as this is exposed as "fixed version" in a same way as previous example. Example: versions: fixed: lc2.gearbox.fw.psid XXX lc8.gearbox.fw.psid YYY 3) Component name This one is a bit tricky. It is not a version, so put it under "versions" does not make much sense. Also, there are more than one. Looking at how versions are done, multiple nests of attr type DEVLINK_ATTR_INFO_VERSION_* are put to the message. So we can introduce new attr DEVLINK_ATTR_INFO_FLASH_COMPONENT and put one per linecard/gearbox. Here arain we need some kind of helper to driver to call to put this into msg: devlink_info_lc_flash_component_put() Example: pci/0000:01:00.0: driver mlxsw_spectrum3 flash_components: lc2.gearbox.fw lc8.gearbox.fw Then the flashing of the gearbox will be done by: # devlink dev flash pci/0000:01:00.0 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 component lc2.gearbox.fw Maybe this would call for some sort of API between driver and devlink to register individial components. devlink.c can then sanitize the input value of the component according to the registered list. Even withot that I think this would be valuable anyway to let the user know what are the supported component names. What do you think? If you agree, I can jump into implementing this and you can feel free to revert this patchset. > >> lc8.gearbox0 1.2.3 >> >> Okay, I see. So instead of having clear api with relationships and >> clear human+machine readability we have squahed indexes into strings. >> I fail to see the benefit, other than no-api-extension :/ On contrary. > >Show me the real life use for all the "clear api with relationships" >and I'll shut up. See the pseudo-code above. > >I would not take falling back to physical (HW) hierarchy for the API >design as a point of pride. Seems lazy if I'm completely honest. I got that. I spent a lot of time to find a good abstraction though. >Someone else's HW may have a different hierarchy, and you're just >forcing the automation engineer iterate over irrelevant structures >("devices"). Well, "linecard device" could be anything on th LC, from gearbox to whatever. It should fit. But I get your point. > >My hunch is that automation will not want to deal with line cards >separately, and flash the entire devices in one go to a tested and >verified bundle blob provided by the vendor. If they do want to poke >at line cards - the information is still there in what I described. > >> >$ devlink lc show pci/0000:01:00.0 lc 8 >> >pci/0000:01:00.0: >> > lc 8 state active type 16x100G >> > supported_types: >> > 16x100G >> > versions: >> > lc8.hw.revision (a) >> > lc8.gearbox (1.2.3) >> > >> >Where the data in the brackets is optionally fetched thru the existing >> >"dev info" API, but rendered together by the user space. >> >> Quite odd. I find it questionable to say at least to mix multiple >> command netlink outputs into one output. > >Really? So we're going to be designing kernel APIs so that each message >contains complete information and can't contain references now? Can you give me an exapmple of devlink or any other iproute2 app cmd that actually does something similar to this? > >> The processing of it would be a small nightmare considering the way >> how the netlink message processing works in iproute2 :/
On Fri, 29 Apr 2022 13:51:33 +0200 Jiri Pirko wrote: > >Of the three API levels (SDK, automation, human) I think automation > >is the only one that's interesting to us in Linux. SDK interfaces are > >necessarily too low level as they expose too much of internal details > >to standardize. Humans are good with dealing with uncertainty and > >diverse so there's no a good benchmark. > > > >The benchmark for automation is - can a machine use this API across > >different vendors to reliably achieve its goals. For FW info/flashing > >the goal is keeping the FW versions up to date. This is documented: > > > >https://www.kernel.org/doc/html/latest/networking/devlink/devlink-flash.html#firmware-version-management > > > >What would the pseudo code look like with "line cards" in the picture? > >Apply RFC1925 truth 12. > > Something like this: > > $lc_count = array_size(devlink-lc-info[$handle]) > > for ($lcnum = 0; $lcnum < $lc_count; lcnum++): > $dev_count = array_size(devlink-lc-info[$handle][$lcnum]) > > for ($devnum = 0; $devnum < $dev_count; $devnum++): Here goes the iteration I complained about in my previous message. Tracking FW versions makes most sense at the level of a product (as in the unit of HW one can purchase from the system vendor). That integrates well with system tracking HW in the fleet. Product in your case will be a line card or populated chassis, I believe. > # Get unique HW design identifier (gearbox id) > $hw_id = devlink-lc-info[$handle][$lcnum][$devnum]['fw.psid'] 1) you can't use 'fw.psid' in generic logic, it's a Melvidia's invention 2) looking at your cover letter there's no fw.psid for the device reported, the automation will not work, Q.E.D. > # Find out which FW flash we want to use for this device > $want_flash_vers = some-db-backed.lookup($hw_id, 'flash') > > # Update flash if necessary > if $want_flash_vers != devlink-lc-info[$handle][$lcnum][$devnum]['fw']: > $file = some-db-backed.download($hw_id, 'flash') > $component = devlink-lc[$handle][$lcnum][$devnum]['component'] > devlink-dev-flash($handle, $component, $file) > > devlink-reload() > > Clear indexes, not squashed somewhere in middle of string. > > >I thought you said your gearboxes all the the same FW? > >Theoretically, yes. Theoretically, I can also have nested "line cards". > > Well, yeah. I was under impresion that possibility of having multiple > devices on the same LC is not close to 0. But I get your point. > > Let's try to figure out he iface as you want it: > We will have devlink dev info extended to be able to provide info > about the LC/gearbox. Let's work with same prefix "lcX." for all > info related to line card X. > > First problem is, who is going to enforce this prefix. It is driver's > responsibility to provide the string which is put out. The solution > would be to have a helper similar to devlink_info_version_*_put() > called devlink_info_lc_version_*_put() that would accept lc pointer and > add the prefix. Does it make sense to you? > > We need 3 things: > 1) current version of gearbox FW > That is easy, we have it - "versions" > (DEVLINK_ATTR_INFO_VERSION_* nested attrs). We can have multiple > nests that would carry the versions for individiual line cards. > Example: > versions: > fixed: > hw.revision 0 > lc2.hw.revision a > lc8.hw.revision b > running: > ini.version 4 > lc2.gearbox.fw.version 1.1.3 > lc8.gearbox.fw.version 1.2.3 > 2) HW id (as you have it in your pseudocode), it is PSID in case of > mlxsw. We already have PSID for ASIC: > .... > This should be also easy, as this is exposed as "fixed version" in a > same way as previous example. > Example: > versions: > fixed: > lc2.gearbox.fw.psid XXX > lc8.gearbox.fw.psid YYY > 3) Component name > This one is a bit tricky. It is not a version, so put it under > "versions" does not make much sense. > Also, there are more than one. Looking at how versions are done, > multiple nests of attr type DEVLINK_ATTR_INFO_VERSION_* are put to > the message. So we can introduce new attr DEVLINK_ATTR_INFO_FLASH_COMPONENT > and put one per linecard/gearbox. > Here arain we need some kind of helper to driver to call to put this > into msg: devlink_info_lc_flash_component_put() > Example: > pci/0000:01:00.0: > driver mlxsw_spectrum3 > flash_components: > lc2.gearbox.fw > lc8.gearbox.fw > > Then the flashing of the gearbox will be done by: > # devlink dev flash pci/0000:01:00.0 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 component lc2.gearbox.fw The main question to me is whether users will want to flash the entire device, or update line cards individually. What's inside mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2? Doesn't sound like it's FW just for a single gearbox? > >Really? So we're going to be designing kernel APIs so that each message > >contains complete information and can't contain references now? > > Can you give me an exapmple of devlink or any other iproute2 app cmd > that actually does something similar to this? Off the top of my head - doesn't ip has some caches for name resolution etc.? Either way current code in iproute2.git is hardly written on the stone tablets.
Fri, Apr 29, 2022 at 08:45:35PM CEST, kuba@kernel.org wrote: >On Fri, 29 Apr 2022 13:51:33 +0200 Jiri Pirko wrote: >> >Of the three API levels (SDK, automation, human) I think automation >> >is the only one that's interesting to us in Linux. SDK interfaces are >> >necessarily too low level as they expose too much of internal details >> >to standardize. Humans are good with dealing with uncertainty and >> >diverse so there's no a good benchmark. >> > >> >The benchmark for automation is - can a machine use this API across >> >different vendors to reliably achieve its goals. For FW info/flashing >> >the goal is keeping the FW versions up to date. This is documented: >> > >> >https://www.kernel.org/doc/html/latest/networking/devlink/devlink-flash.html#firmware-version-management >> > >> >What would the pseudo code look like with "line cards" in the picture? >> >Apply RFC1925 truth 12. >> >> Something like this: >> >> $lc_count = array_size(devlink-lc-info[$handle]) >> >> for ($lcnum = 0; $lcnum < $lc_count; lcnum++): >> $dev_count = array_size(devlink-lc-info[$handle][$lcnum]) >> >> for ($devnum = 0; $devnum < $dev_count; $devnum++): > >Here goes the iteration I complained about in my previous message. >Tracking FW versions makes most sense at the level of a product (as >in the unit of HW one can purchase from the system vendor). That >integrates well with system tracking HW in the fleet. Product in your >case will be a line card or populated chassis, I believe. Well, most probably, you are right. In theory, there migth de 2 types of devices/gearboxes on a single line card. I admit I find it unlikely now. > >> # Get unique HW design identifier (gearbox id) >> $hw_id = devlink-lc-info[$handle][$lcnum][$devnum]['fw.psid'] > >1) you can't use 'fw.psid' in generic logic, it's a Melvidia's invention >2) looking at your cover letter there's no fw.psid for the device > reported, the automation will not work, Q.E.D. We can make is a "symlink" to fw.hw_id or whatever. But that is not the point here. For ASIC FW, we currently have also fw.psid. > >> # Find out which FW flash we want to use for this device >> $want_flash_vers = some-db-backed.lookup($hw_id, 'flash') >> >> # Update flash if necessary >> if $want_flash_vers != devlink-lc-info[$handle][$lcnum][$devnum]['fw']: >> $file = some-db-backed.download($hw_id, 'flash') >> $component = devlink-lc[$handle][$lcnum][$devnum]['component'] >> devlink-dev-flash($handle, $component, $file) >> >> devlink-reload() >> >> Clear indexes, not squashed somewhere in middle of string. >> >> >I thought you said your gearboxes all the the same FW? >> >Theoretically, yes. Theoretically, I can also have nested "line cards". >> >> Well, yeah. I was under impresion that possibility of having multiple >> devices on the same LC is not close to 0. But I get your point. >> >> Let's try to figure out he iface as you want it: >> We will have devlink dev info extended to be able to provide info >> about the LC/gearbox. Let's work with same prefix "lcX." for all >> info related to line card X. >> >> First problem is, who is going to enforce this prefix. It is driver's >> responsibility to provide the string which is put out. The solution >> would be to have a helper similar to devlink_info_version_*_put() >> called devlink_info_lc_version_*_put() that would accept lc pointer and >> add the prefix. Does it make sense to you? >> >> We need 3 things: >> 1) current version of gearbox FW >> That is easy, we have it - "versions" >> (DEVLINK_ATTR_INFO_VERSION_* nested attrs). We can have multiple >> nests that would carry the versions for individiual line cards. >> Example: >> versions: >> fixed: >> hw.revision 0 >> lc2.hw.revision a >> lc8.hw.revision b >> running: >> ini.version 4 >> lc2.gearbox.fw.version 1.1.3 >> lc8.gearbox.fw.version 1.2.3 >> 2) HW id (as you have it in your pseudocode), it is PSID in case of >> mlxsw. We already have PSID for ASIC: >> .... >> This should be also easy, as this is exposed as "fixed version" in a >> same way as previous example. >> Example: >> versions: >> fixed: >> lc2.gearbox.fw.psid XXX >> lc8.gearbox.fw.psid YYY >> 3) Component name >> This one is a bit tricky. It is not a version, so put it under >> "versions" does not make much sense. >> Also, there are more than one. Looking at how versions are done, >> multiple nests of attr type DEVLINK_ATTR_INFO_VERSION_* are put to >> the message. So we can introduce new attr DEVLINK_ATTR_INFO_FLASH_COMPONENT >> and put one per linecard/gearbox. >> Here arain we need some kind of helper to driver to call to put this >> into msg: devlink_info_lc_flash_component_put() >> Example: >> pci/0000:01:00.0: >> driver mlxsw_spectrum3 >> flash_components: >> lc2.gearbox.fw >> lc8.gearbox.fw >> >> Then the flashing of the gearbox will be done by: >> # devlink dev flash pci/0000:01:00.0 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 component lc2.gearbox.fw > >The main question to me is whether users will want to flash the entire >device, or update line cards individually. I think it makes sense to update them individually. The versions are also reported individually. What's the benefit of not doing that. Also, how would you name the "group" component. Sounds odd to me. > >What's inside mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2? Doesn't >sound like it's FW just for a single gearbox? > >> >Really? So we're going to be designing kernel APIs so that each message >> >contains complete information and can't contain references now? >> >> Can you give me an exapmple of devlink or any other iproute2 app cmd >> that actually does something similar to this? > >Off the top of my head - doesn't ip has some caches for name resolution >etc.? Either way current code in iproute2.git is hardly written on the >stone tablets. Not sure, that is why I thought you might have an example. Anyway, I don't think it is important. I think that all 3 values exposed I described above can be just in devlink dev info.
On Fri, 29 Apr 2022 21:29:16 +0200 Jiri Pirko wrote: > >The main question to me is whether users will want to flash the entire > >device, or update line cards individually. > > I think it makes sense to update them individually. The versions are > also reported individually. Okay, but neither I want that, nor does it match what Ido described as the direction for mlxsw, quoting: The idea (implemented in the next patchset) is to let these devices expose their own "component name", which can then be plugged into the existing flash command: $ devlink lc show pci/0000:01:00.0 lc 8 pci/0000:01:00.0: lc 8 state active type 16x100G supported_types: 16x100G devices: device 0 flashable true component lc8_dev0 device 1 flashable false device 2 flashable false device 3 flashable false $ devlink dev flash pci/0000:01:00.0 file some_file.mfa2 component lc8_dev0 Your "devices" are _not_ individually flashable. It seems natural for single-board devices like a NIC or a line card to have a single flash with all the images burned together. > What's the benefit of not doing that. As already mentioned in my previous reply the user will likely have a database of all their networking assets, and having to break them up further than the physical piece of gear they order from the supplier is a pain. Plus the vendor will likely also prefer to ship a single validated image rather than a blob for every board component with FW. > Also, how would you name the "group" component. Sounds odd to me. To flash the whole device we skip the component. > >What's inside mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2? Doesn't > >sound like it's FW just for a single gearbox? Please answer questions. I already complained about this once in this thread.
Sat, Apr 30, 2022 at 12:38:45AM CEST, kuba@kernel.org wrote: >On Fri, 29 Apr 2022 21:29:16 +0200 Jiri Pirko wrote: >> >The main question to me is whether users will want to flash the entire >> >device, or update line cards individually. >> >> I think it makes sense to update them individually. The versions are >> also reported individually. > >Okay, but neither I want that, nor does it match what Ido described as >the direction for mlxsw, quoting: > > The idea (implemented in the next patchset) is to let these devices > expose their own "component name", which can then be plugged into the > existing flash command: > > $ devlink lc show pci/0000:01:00.0 lc 8 > pci/0000:01:00.0: > lc 8 state active type 16x100G > supported_types: > 16x100G > devices: > device 0 flashable true component lc8_dev0 > device 1 flashable false > device 2 flashable false > device 3 flashable false > $ devlink dev flash pci/0000:01:00.0 file some_file.mfa2 component lc8_dev0 > >Your "devices" are _not_ individually flashable. It seems natural for >single-board devices like a NIC or a line card to have a single flash >with all the images burned together. Wait a second. I think that we don't understand each other. Currently, we have a single device to flash on a linecard, the gearbox. There is one file to flash it. So 1:1 between line card and file to flash. That is exactly as I described in the proposal. 1 component name per line card. > >> What's the benefit of not doing that. > >As already mentioned in my previous reply the user will likely have >a database of all their networking assets, and having to break them >up further than the physical piece of gear they order from the supplier >is a pain. Plus the vendor will likely also prefer to ship a single >validated image rather than a blob for every board component with FW. Depends on the vendor :) But it is hypothetical, let's see what the future brings. But I agree with you. > >> Also, how would you name the "group" component. Sounds odd to me. > >To flash the whole device we skip the component. Sec. I think these is a misunderstanding here. The component it what we already have in devlink dev flash. Quoting devlink-dev man page: devlink dev flash - write device's non-volatile memory. DEV - specifies the devlink device to write to. file PATH - Path to the file which will be written into device's flash. The path needs to be relative to one of the directories searched by the kernel firmware loaded, such as /lib/firmware. ----> component NAME - If device stores multiple firmware images in non- volatile memory, this parameter may be used to indicate which firmware image should be written. The value of NAME should match the component names from devlink dev info and may be driver-dependent. This is currently not used in devlink capable drivers. It is a concept taken from ethtool (I think you were the one that requested to take it, but my memory could be wrong). Anyway, the default is component NULL. In case of mlxsw, in that case the target is ASIC FW. Now I just want to use this component name to target individual line cards. I see it is a nice fit. Don't you think? I see that the manpage is mentioning "the component names from devlink dev info" which is not actually implemented, but exactly what I proposed. > >> >What's inside mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2? Doesn't >> >sound like it's FW just for a single gearbox? > >Please answer questions. I already complained about this once in >this thread. Sorry, I missed this one. The file IS a FW just for a SINGLE gearbox.
On Sat, 30 Apr 2022 08:27:35 +0200 Jiri Pirko wrote: > Now I just want to use this component name to target individual line > cards. I see it is a nice fit. Don't you think? Still on the fence. > I see that the manpage is mentioning "the component names from devlink dev info" > which is not actually implemented, but exactly what I proposed. How do you tie the line card to the component name? lc8_dev0 from the flashing example is not present in the lc info output. > >Please answer questions. I already complained about this once in > >this thread. > > Sorry, I missed this one. The file IS a FW just for a SINGLE gearbox. I see.
First of all, sorry for delay :/ Mon, May 02, 2022 at 04:39:33PM CEST, kuba@kernel.org wrote: >On Sat, 30 Apr 2022 08:27:35 +0200 Jiri Pirko wrote: >> Now I just want to use this component name to target individual line >> cards. I see it is a nice fit. Don't you think? > >Still on the fence. Why? > >> I see that the manpage is mentioning "the component names from devlink dev info" >> which is not actually implemented, but exactly what I proposed. > >How do you tie the line card to the component name? lc8_dev0 from >the flashing example is not present in the lc info output. Okay, I will move it there. Makes sense. > >> >Please answer questions. I already complained about this once in >> >this thread. >> >> Sorry, I missed this one. The file IS a FW just for a SINGLE gearbox. > >I see.
On Mon, 23 May 2022 11:42:07 +0200 Jiri Pirko wrote: > Mon, May 02, 2022 at 04:39:33PM CEST, kuba@kernel.org wrote: > >On Sat, 30 Apr 2022 08:27:35 +0200 Jiri Pirko wrote: > >> Now I just want to use this component name to target individual line > >> cards. I see it is a nice fit. Don't you think? > > > >Still on the fence. > > Why? IIRC my concern was mixing objects. We have component name coming from lc info, but then use it in dev flash. > >> I see that the manpage is mentioning "the component names from devlink dev info" > >> which is not actually implemented, but exactly what I proposed. > > > >How do you tie the line card to the component name? lc8_dev0 from > >the flashing example is not present in the lc info output. > > Okay, I will move it there. Makes sense. FWIW I think I meant my comment as a way to underline that what you argue for is not what's implemented (assuming your "not actually implemented" referred to the flashing). I was trying to send you back to the drawing board rather than break open a box of band-aides.
Mon, May 23, 2022 at 07:56:40PM CEST, kuba@kernel.org wrote: >On Mon, 23 May 2022 11:42:07 +0200 Jiri Pirko wrote: >> Mon, May 02, 2022 at 04:39:33PM CEST, kuba@kernel.org wrote: >> >On Sat, 30 Apr 2022 08:27:35 +0200 Jiri Pirko wrote: >> >> Now I just want to use this component name to target individual line >> >> cards. I see it is a nice fit. Don't you think? >> > >> >Still on the fence. >> >> Why? > >IIRC my concern was mixing objects. We have component name coming from >lc info, but then use it in dev flash. Sure. I considered that. The thing is, even if you put the lc component names to output of "devlink dev info", you would need to provide lc objects as well (somehow) - to contain the versions. But the component name is related to lc object listed in "devlink lc", so "devlink lc info" sounds line the correct place to put it. If you are concern about "devlink dev flash" using component name from "devlink lc info", I would rather introduce "devlink lc flash" so you have a match. But from what I see, I don't really see the necessity for this match. Do you? > >> >> I see that the manpage is mentioning "the component names from devlink dev info" >> >> which is not actually implemented, but exactly what I proposed. >> > >> >How do you tie the line card to the component name? lc8_dev0 from >> >the flashing example is not present in the lc info output. >> >> Okay, I will move it there. Makes sense. > >FWIW I think I meant my comment as a way to underline that what you >argue for is not what's implemented (assuming your "not actually >implemented" referred to the flashing). I was trying to send you back >to the drawing board rather than break open a box of band-aides. Sure, lets do this right, I don't want to band-aide anything...
Tue, May 24, 2022 at 08:46:38AM CEST, jiri@resnulli.us wrote: >Mon, May 23, 2022 at 07:56:40PM CEST, kuba@kernel.org wrote: >>On Mon, 23 May 2022 11:42:07 +0200 Jiri Pirko wrote: >>> Mon, May 02, 2022 at 04:39:33PM CEST, kuba@kernel.org wrote: >>> >On Sat, 30 Apr 2022 08:27:35 +0200 Jiri Pirko wrote: >>> >> Now I just want to use this component name to target individual line >>> >> cards. I see it is a nice fit. Don't you think? >>> > >>> >Still on the fence. >>> >>> Why? >> >>IIRC my concern was mixing objects. We have component name coming from >>lc info, but then use it in dev flash. > >Sure. I considered that. The thing is, even if you put the lc component >names to output of "devlink dev info", you would need to provide lc >objects as well (somehow) - to contain the versions. > >But the component name is related to lc object listed in "devlink lc", >so "devlink lc info" sounds line the correct place to put it. > >If you are concern about "devlink dev flash" using component name from >"devlink lc info", I would rather introduce "devlink lc flash" so you >have a match. But from what I see, I don't really see the necessity for >this match. Do you? Okay, we can eventually avoid using component name at all for now, considering one flash object per linecard (with possibility to extend by component later on). This would look like: $ devlink lc info pci/0000:01:00.0 lc 8 pci/0000:01:00.0: lc 8 versions: fixed: hw.revision 0 fw.psid MT_0000000749 running: ini.version 4 fw 19.2010.1310 $ devlink lc flash pci/0000:01:00.0 lc 8 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 I have to admit I like this. We would reuse the existing DEVLINK_CMD_FLASH_UPDATE cmd and when DEVLINK_ATTR_LINECARD_INDEX attribute is present, we call the lc-flash op. How does this sound? > > >> >>> >> I see that the manpage is mentioning "the component names from devlink dev info" >>> >> which is not actually implemented, but exactly what I proposed. >>> > >>> >How do you tie the line card to the component name? lc8_dev0 from >>> >the flashing example is not present in the lc info output. >>> >>> Okay, I will move it there. Makes sense. >> >>FWIW I think I meant my comment as a way to underline that what you >>argue for is not what's implemented (assuming your "not actually >>implemented" referred to the flashing). I was trying to send you back >>to the drawing board rather than break open a box of band-aides. > >Sure, lets do this right, I don't want to band-aide anything...
On Tue, 24 May 2022 16:31:45 +0200 Jiri Pirko wrote: > >Sure. I considered that. The thing is, even if you put the lc component > >names to output of "devlink dev info", you would need to provide lc > >objects as well (somehow) - to contain the versions. > > > >But the component name is related to lc object listed in "devlink lc", > >so "devlink lc info" sounds line the correct place to put it. > > > >If you are concern about "devlink dev flash" using component name from > >"devlink lc info", I would rather introduce "devlink lc flash" so you > >have a match. But from what I see, I don't really see the necessity for > >this match. Do you? > > Okay, we can eventually avoid using component name at all for now, > considering one flash object per linecard (with possibility to extend by > component later on). This would look like: > > $ devlink lc info pci/0000:01:00.0 lc 8 > pci/0000:01:00.0: > lc 8 > versions: > fixed: > hw.revision 0 > fw.psid MT_0000000749 > running: > ini.version 4 > fw 19.2010.1310 > > $ devlink lc flash pci/0000:01:00.0 lc 8 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 > > I have to admit I like this. > We would reuse the existing DEVLINK_CMD_FLASH_UPDATE cmd and when > DEVLINK_ATTR_LINECARD_INDEX attribute is present, we call the lc-flash > op. How does this sound? We talked about this earlier in the thread, I think. If you need both info and flash per LC just make them a separate devlink instance and let them have all the objects they need. Then just put the instance name under lc info.
Tue, May 24, 2022 at 08:00:57PM CEST, kuba@kernel.org wrote: >On Tue, 24 May 2022 16:31:45 +0200 Jiri Pirko wrote: >> >Sure. I considered that. The thing is, even if you put the lc component >> >names to output of "devlink dev info", you would need to provide lc >> >objects as well (somehow) - to contain the versions. >> > >> >But the component name is related to lc object listed in "devlink lc", >> >so "devlink lc info" sounds line the correct place to put it. >> > >> >If you are concern about "devlink dev flash" using component name from >> >"devlink lc info", I would rather introduce "devlink lc flash" so you >> >have a match. But from what I see, I don't really see the necessity for >> >this match. Do you? >> >> Okay, we can eventually avoid using component name at all for now, >> considering one flash object per linecard (with possibility to extend by >> component later on). This would look like: >> >> $ devlink lc info pci/0000:01:00.0 lc 8 >> pci/0000:01:00.0: >> lc 8 >> versions: >> fixed: >> hw.revision 0 >> fw.psid MT_0000000749 >> running: >> ini.version 4 >> fw 19.2010.1310 >> >> $ devlink lc flash pci/0000:01:00.0 lc 8 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 >> >> I have to admit I like this. >> We would reuse the existing DEVLINK_CMD_FLASH_UPDATE cmd and when >> DEVLINK_ATTR_LINECARD_INDEX attribute is present, we call the lc-flash >> op. How does this sound? > >We talked about this earlier in the thread, I think. If you need both >info and flash per LC just make them a separate devlink instance and >let them have all the objects they need. Then just put the instance >name under lc info. I don't follow :/ What do you mean be "separate devlink instance" here? Could you draw me an example?
On Wed, 25 May 2022 08:20:45 +0200 Jiri Pirko wrote: > >We talked about this earlier in the thread, I think. If you need both > >info and flash per LC just make them a separate devlink instance and > >let them have all the objects they need. Then just put the instance > >name under lc info. > > I don't follow :/ What do you mean be "separate devlink instance" here? > Could you draw me an example? Separate instance: for (i = 0; i < sw->num_lcs; i++) { devlink_register(&sw->lc_dl[i]); devlink_line_card_link(&sw->lc[i], &sw->lc_dl[i]); } then report that under the linecard nla_nest_start(msg, DEVLINK_SUBORDINATE_INSTANCE); devlink_nl_put_handle(msg, lc->devlink); nla_nest_end(msg...) then user can update the linecard like any devlink instance, switch, NIC etc. It's better code reuse and I don't see any downside, TBH.
Wed, May 25, 2022 at 05:50:54PM CEST, kuba@kernel.org wrote: >On Wed, 25 May 2022 08:20:45 +0200 Jiri Pirko wrote: >> >We talked about this earlier in the thread, I think. If you need both >> >info and flash per LC just make them a separate devlink instance and >> >let them have all the objects they need. Then just put the instance >> >name under lc info. >> >> I don't follow :/ What do you mean be "separate devlink instance" here? >> Could you draw me an example? > >Separate instance: > > for (i = 0; i < sw->num_lcs; i++) { > devlink_register(&sw->lc_dl[i]); > devlink_line_card_link(&sw->lc[i], &sw->lc_dl[i]); > } > >then report that under the linecard > > nla_nest_start(msg, DEVLINK_SUBORDINATE_INSTANCE); > devlink_nl_put_handle(msg, lc->devlink); > nla_nest_end(msg...) > >then user can update the linecard like any devlink instance, switch, >NIC etc. It's better code reuse and I don't see any downside, TBH. I think you already suggested something like that. What you mean is that linecard would be modeled as any other devlink instance. Sorry, but that does not make any sense to me at all :/ Should devlink port not be devlink port and rather modeled as a separate devlink instance too? Same to the rest of the objects we already have. Should we have a tree of same devlink objects each overloaded to some specific type. If yes, that is completely changing the devlink modelling. The handle of the devlink object is bus_name/dev_name. In other words, there is a device on a bus and for each you can create devlink instance. Linecards is not on a any bus, is is not modeled in /sys. What would be the handle? What you suggest seems to me like completely broken :/ I don't understand the motivation :/ The only I can think of that you will have the same "devlink dev flash" mechanism, but that's it. I'm not sure I understand why we cannot resolve the flashing otherwise, for example as I suggested with "devlink lc info" and "devlink lc flash". What is wrong with that?
Thu, May 26, 2022 at 11:05:53AM CEST, jiri@resnulli.us wrote: >Wed, May 25, 2022 at 05:50:54PM CEST, kuba@kernel.org wrote: >>On Wed, 25 May 2022 08:20:45 +0200 Jiri Pirko wrote: >>> >We talked about this earlier in the thread, I think. If you need both >>> >info and flash per LC just make them a separate devlink instance and >>> >let them have all the objects they need. Then just put the instance >>> >name under lc info. >>> >>> I don't follow :/ What do you mean be "separate devlink instance" here? >>> Could you draw me an example? >> >>Separate instance: >> >> for (i = 0; i < sw->num_lcs; i++) { >> devlink_register(&sw->lc_dl[i]); >> devlink_line_card_link(&sw->lc[i], &sw->lc_dl[i]); >> } >> >>then report that under the linecard >> >> nla_nest_start(msg, DEVLINK_SUBORDINATE_INSTANCE); >> devlink_nl_put_handle(msg, lc->devlink); >> nla_nest_end(msg...) >> >>then user can update the linecard like any devlink instance, switch, >>NIC etc. It's better code reuse and I don't see any downside, TBH. > >I think you already suggested something like that. What you mean is that >linecard would be modeled as any other devlink instance. Sorry, but that >does not make any sense to me at all :/ Should devlink port not be >devlink port and rather modeled as a separate devlink instance too? Same >to the rest of the objects we already have. Should we have a tree of >same devlink objects each overloaded to some specific type. If yes, that >is completely changing the devlink modelling. > >The handle of the devlink object is bus_name/dev_name. In other words, >there is a device on a bus and for each you can create devlink instance. >Linecards is not on a any bus, is is not modeled in /sys. What would be >the handle? > >What you suggest seems to me like completely broken :/ After some thinking I think I understand why you suggest this. The versions in dev info and dev flash is there on devlink instance level, so you would like to reuse it. But it does not click. Having devlink instance just for this feels very artificial, unfitting and heavy. IDK. > >I don't understand the motivation :/ The only I can think of that you >will have the same "devlink dev flash" mechanism, but that's it. >I'm not sure I understand why we cannot resolve the flashing otherwise, >for example as I suggested with "devlink lc info" and "devlink lc >flash". What is wrong with that?
Wed, May 25, 2022 at 05:50:54PM CEST, kuba@kernel.org wrote: >On Wed, 25 May 2022 08:20:45 +0200 Jiri Pirko wrote: >> >We talked about this earlier in the thread, I think. If you need both >> >info and flash per LC just make them a separate devlink instance and >> >let them have all the objects they need. Then just put the instance >> >name under lc info. >> >> I don't follow :/ What do you mean be "separate devlink instance" here? >> Could you draw me an example? > >Separate instance: > > for (i = 0; i < sw->num_lcs; i++) { > devlink_register(&sw->lc_dl[i]); > devlink_line_card_link(&sw->lc[i], &sw->lc_dl[i]); > } > >then report that under the linecard > > nla_nest_start(msg, DEVLINK_SUBORDINATE_INSTANCE); > devlink_nl_put_handle(msg, lc->devlink); > nla_nest_end(msg...) > >then user can update the linecard like any devlink instance, switch, >NIC etc. It's better code reuse and I don't see any downside, TBH. Okay, I was thinking about this a litle bit more, and I would like to explore extending the components path. Exposing the components in "devlink dev info" and then using them in "devlink dev flash". LC could be just one of multiple potential users of components. Will send RFC soon.
On Thu, 26 May 2022 13:45:49 +0200 Jiri Pirko wrote: > >Separate instance: > > > > for (i = 0; i < sw->num_lcs; i++) { > > devlink_register(&sw->lc_dl[i]); > > devlink_line_card_link(&sw->lc[i], &sw->lc_dl[i]); > > } > > > >then report that under the linecard > > > > nla_nest_start(msg, DEVLINK_SUBORDINATE_INSTANCE); > > devlink_nl_put_handle(msg, lc->devlink); > > nla_nest_end(msg...) > > > >then user can update the linecard like any devlink instance, switch, > >NIC etc. It's better code reuse and I don't see any downside, TBH. > > Okay, I was thinking about this a litle bit more, and I would like to > explore extending the components path. Exposing the components in > "devlink dev info" and then using them in "devlink dev flash". LC could > be just one of multiple potential users of components. Will send RFC > soon. Feel free to send a mockup of the devlink user space outputs. The core code for devlink is just meaningless marshaling anyway.
Thu, May 26, 2022 at 07:35:39PM CEST, kuba@kernel.org wrote: >On Thu, 26 May 2022 13:45:49 +0200 Jiri Pirko wrote: >> >Separate instance: >> > >> > for (i = 0; i < sw->num_lcs; i++) { >> > devlink_register(&sw->lc_dl[i]); >> > devlink_line_card_link(&sw->lc[i], &sw->lc_dl[i]); >> > } >> > >> >then report that under the linecard >> > >> > nla_nest_start(msg, DEVLINK_SUBORDINATE_INSTANCE); >> > devlink_nl_put_handle(msg, lc->devlink); >> > nla_nest_end(msg...) >> > >> >then user can update the linecard like any devlink instance, switch, >> >NIC etc. It's better code reuse and I don't see any downside, TBH. >> >> Okay, I was thinking about this a litle bit more, and I would like to >> explore extending the components path. Exposing the components in >> "devlink dev info" and then using them in "devlink dev flash". LC could >> be just one of multiple potential users of components. Will send RFC >> soon. > >Feel free to send a mockup of the devlink user space outputs. >The core code for devlink is just meaningless marshaling anyway. Okay. So the output of devlink dev info would be extended by "components" nest. This nest would carry array of components which contain versions. The name of the component is openin each array member nest: $ devlink dev info pci/0000:01:00.0: driver mlxsw_spectrum2 versions: fixed: hw.revision A0 fw.psid MT_0000000199 running: fw.version 29.2010.2302 fw 29.2010.2302 components: lc1: versions: fixed: hw.revision 0 fw.psid MT_0000000111 running: fw 19.2010.1310 ini.version 4 lc2: versions: fixed: hw.revision 0 fw.psid MT_0000000111 running: fw 19.2010.1310 ini.version 4 someothercomponentname: versions: running: fw: 888 Now on top of exsisting "devlink dev flash" cmd without component, user may specify the component name from the array above: $ devlink dev flash pci/0000:01:00.0 component lc1 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 $ devlink dev flash pci/0000:01:00.0 component someothercomponentname file foo.bin Note this is generic vehicle, line cards would benefit but it is usable for multiple ASIC FW partitions for example. Note that on "devlink dev flash" there is no change. This is implemented currently. Only "devlink dev info" is extended to show the component list.
On Fri, 27 May 2022 09:27:47 +0200 Jiri Pirko wrote: > Okay. So the output of devlink dev info would be extended by > "components" nest. This nest would carry array of components which > contain versions. The name of the component is openin each array member > nest: > > $ devlink dev info > pci/0000:01:00.0: > driver mlxsw_spectrum2 > versions: > fixed: > hw.revision A0 > fw.psid MT_0000000199 > running: > fw.version 29.2010.2302 > fw 29.2010.2302 > components: > lc1: Is the "lc1" free-form or generated by the core based on subobjects? Is it carried as a string or object type + id? I guess my suggestion of a CLI mockup has proven its weakness :) > versions: > fixed: > hw.revision 0 > fw.psid MT_0000000111 > running: > fw 19.2010.1310 > ini.version 4 > lc2: > versions: > fixed: > hw.revision 0 > fw.psid MT_0000000111 > running: > fw 19.2010.1310 > ini.version 4 > someothercomponentname: > versions: > running: > fw: 888 > > Now on top of exsisting "devlink dev flash" cmd without component, user > may specify the component name from the array above: > > $ devlink dev flash pci/0000:01:00.0 component lc1 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 > > $ devlink dev flash pci/0000:01:00.0 component someothercomponentname file foo.bin > > Note this is generic vehicle, line cards would benefit but it is usable > for multiple ASIC FW partitions for example. > > Note that on "devlink dev flash" there is no change. This is implemented > currently. Only "devlink dev info" is extended to show the component > list. I sort of assumed that the DEVLINK_ATTR_INFO_VERSION_NAME is the component, the docs also use the word "component" for it. For the nfp for instance we had "fw.app" for the datapath microcode and "fw.mgmt" for the control processor. These are separate partitions on the flash. I don't think we ever implemented writing them separately but it's certainly was our internal plan at some point.
Sat, May 28, 2022 at 02:10:38AM CEST, kuba@kernel.org wrote: >On Fri, 27 May 2022 09:27:47 +0200 Jiri Pirko wrote: >> Okay. So the output of devlink dev info would be extended by >> "components" nest. This nest would carry array of components which >> contain versions. The name of the component is openin each array member >> nest: >> >> $ devlink dev info >> pci/0000:01:00.0: >> driver mlxsw_spectrum2 >> versions: >> fixed: >> hw.revision A0 >> fw.psid MT_0000000199 >> running: >> fw.version 29.2010.2302 >> fw 29.2010.2302 >> components: >> lc1: > >Is the "lc1" free-form or generated by the core based on subobjects? >Is it carried as a string or object type + id? It could be both: 1) for line cards I plan to have a helper to have this generated by core 2) for other FW objects, it is up to the driver. > >I guess my suggestion of a CLI mockup has proven its weakness :) I'm not sure I understand what you mean by this sentence. Could you please be more blunt? You know, my english is not so good to understand some hidden meanings :) > >> versions: >> fixed: >> hw.revision 0 >> fw.psid MT_0000000111 >> running: >> fw 19.2010.1310 >> ini.version 4 >> lc2: >> versions: >> fixed: >> hw.revision 0 >> fw.psid MT_0000000111 >> running: >> fw 19.2010.1310 >> ini.version 4 >> someothercomponentname: >> versions: >> running: >> fw: 888 >> >> Now on top of exsisting "devlink dev flash" cmd without component, user >> may specify the component name from the array above: >> >> $ devlink dev flash pci/0000:01:00.0 component lc1 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 >> >> $ devlink dev flash pci/0000:01:00.0 component someothercomponentname file foo.bin >> >> Note this is generic vehicle, line cards would benefit but it is usable >> for multiple ASIC FW partitions for example. >> >> Note that on "devlink dev flash" there is no change. This is implemented >> currently. Only "devlink dev info" is extended to show the component >> list. > >I sort of assumed that the DEVLINK_ATTR_INFO_VERSION_NAME is the >component, the docs also use the word "component" for it. Okay, that I didn't see. > >For the nfp for instance we had "fw.app" for the datapath microcode and >"fw.mgmt" for the control processor. These are separate partitions on >the flash. I don't think we ever implemented writing them separately >but it's certainly was our internal plan at some point. Okay, so what you say it, we already have components in "devlink dev info". Like you pointed out as an example: fw.app fw.mgmt so the flash comment would be: devlink dev flash pci/0000:01:00.0 component fw.app file foo.bin devlink dev flash pci/0000:01:00.0 component fw.mgmt file bar.bin ? If yes, what should be the default in case component is not defined? Do we need to expose it in "devlink dev info"? How? So to extend this existing facility with my line card example, we would have: $ devlink dev info pci/0000:01:00.0: driver mlxsw_spectrum2 versions: fixed: hw.revision A0 fw.psid MT_0000000199 lc1.hw.revision 0 lc1.fw.psid MT_0000000111 lc2.hw.revision 0 lc2.fw.psid MT_0000000111 running: fw.version 29.2010.2302 fw 29.2010.2302 lc1.fw 19.2010.1310 lc1.ini.version 4 lc2.fw 19.2010.1310 lc2.ini.version 4 And then: devlink dev flash pci/0000:01:00.0 component lc1.fw file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 Does this sound correct? Also, to avoid free-form, I can imagine to have per-linecard info_get() op which would be called for each line card from devlink_nl_info_fill() and prefix the "lcX" automatically without driver being involved. Sounds good? Thanks!
On 5/24/22 8:31 AM, Jiri Pirko wrote: > > $ devlink lc info pci/0000:01:00.0 lc 8 > pci/0000:01:00.0: ... > > $ devlink lc flash pci/0000:01:00.0 lc 8 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 A lot of your proposed syntax for devlink commands has 'lc' twice. If 'lc' is the subcommand, then you are managing a linecard making 'lc' before the '8' redundant. How about 'slot 8' or something along those lines?
On Sat, 28 May 2022 11:09:01 +0200 Jiri Pirko wrote: > Sat, May 28, 2022 at 02:10:38AM CEST, kuba@kernel.org wrote: > > > >Is the "lc1" free-form or generated by the core based on subobjects? > >Is it carried as a string or object type + id? > > It could be both: > 1) for line cards I plan to have a helper to have this generated by core > 2) for other FW objects, it is up to the driver. Did you mean "either" or "both"? > >I guess my suggestion of a CLI mockup has proven its weakness :) > > I'm not sure I understand what you mean by this sentence. Could you > please be more blunt? You know, my english is not so good to understand > some hidden meanings :) The question of what kind of attribute "lc1" is carried in would had been answered in posting of a code, while CLI mockup doesn't provide such detail. > > > >I sort of assumed that the DEVLINK_ATTR_INFO_VERSION_NAME is the > >component, the docs also use the word "component" for it. > > Okay, that I didn't see. > > > > >For the nfp for instance we had "fw.app" for the datapath microcode and > >"fw.mgmt" for the control processor. These are separate partitions on > >the flash. I don't think we ever implemented writing them separately > >but it's certainly was our internal plan at some point. > > Okay, so what you say it, we already have components in "devlink dev > info". Like you pointed out as an example: > fw.app > fw.mgmt > so the flash comment would be: > devlink dev flash pci/0000:01:00.0 component fw.app file foo.bin > devlink dev flash pci/0000:01:00.0 component fw.mgmt file bar.bin > ? Correct. > If yes, what should be the default in case component is not defined? Do > we need to expose it in "devlink dev info"? How? Not defined as in someone tries to flash component X but there is no version for X in info? > So to extend this existing facility with my line card example, we would > have: > > $ devlink dev info > pci/0000:01:00.0: > driver mlxsw_spectrum2 > versions: > fixed: > hw.revision A0 > fw.psid MT_0000000199 > lc1.hw.revision 0 > lc1.fw.psid MT_0000000111 > lc2.hw.revision 0 > lc2.fw.psid MT_0000000111 > running: > fw.version 29.2010.2302 > fw 29.2010.2302 > lc1.fw 19.2010.1310 > lc1.ini.version 4 > lc2.fw 19.2010.1310 > lc2.ini.version 4 > > And then: > devlink dev flash pci/0000:01:00.0 component lc1.fw file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 > > Does this sound correct? I think I suggested something like that in the past, but back then I was assuming that lc FW would come from the same large FW bundle file as the control plan FW, and we would not have to use the component. Let's step back and look from the automation perspective again. Assuming we don't want to hardcode matching "lc$i" there how can a generic FW update service scan the dev info and decide on what dev flash command to fire off? > Also, to avoid free-form, I can imagine to have per-linecard info_get() op > which would be called for each line card from devlink_nl_info_fill() and > prefix the "lcX" automatically without driver being involved. > > Sounds good? Hm. That's moving the matryoshka-ing of the objects from the uAPI level to the internals. If we don't do the string prefix but instead pass the subobject info to the user space as an attribute per version we can at least avoid per-subobject commands (DEVLINK_CMD_LINECARD_INFO_GET). Much closer to how health reporters are implemented than how params are done, so I think it is a good direction. We still need to iron out how the automation can go over the main FW and sub-objects in a generic way. I still think full devlink sub-instance is better because we will end up needing params or health. Fake devices can be made with auxbus or otherwise. But if you really don't want sub-instances we can explore the above.
Sat, May 28, 2022 at 09:02:53PM CEST, kuba@kernel.org wrote: >On Sat, 28 May 2022 11:09:01 +0200 Jiri Pirko wrote: >> Sat, May 28, 2022 at 02:10:38AM CEST, kuba@kernel.org wrote: >> > >> >Is the "lc1" free-form or generated by the core based on subobjects? >> >Is it carried as a string or object type + id? >> >> It could be both: >> 1) for line cards I plan to have a helper to have this generated by core >> 2) for other FW objects, it is up to the driver. > >Did you mean "either" or "both"? Both. > >> >I guess my suggestion of a CLI mockup has proven its weakness :) >> >> I'm not sure I understand what you mean by this sentence. Could you >> please be more blunt? You know, my english is not so good to understand >> some hidden meanings :) > >The question of what kind of attribute "lc1" is carried in would had >been answered in posting of a code, while CLI mockup doesn't provide >such detail. > >> > >> >I sort of assumed that the DEVLINK_ATTR_INFO_VERSION_NAME is the >> >component, the docs also use the word "component" for it. >> >> Okay, that I didn't see. >> >> > >> >For the nfp for instance we had "fw.app" for the datapath microcode and >> >"fw.mgmt" for the control processor. These are separate partitions on >> >the flash. I don't think we ever implemented writing them separately >> >but it's certainly was our internal plan at some point. >> >> Okay, so what you say it, we already have components in "devlink dev >> info". Like you pointed out as an example: >> fw.app >> fw.mgmt >> so the flash comment would be: >> devlink dev flash pci/0000:01:00.0 component fw.app file foo.bin >> devlink dev flash pci/0000:01:00.0 component fw.mgmt file bar.bin >> ? > >Correct. > >> If yes, what should be the default in case component is not defined? Do >> we need to expose it in "devlink dev info"? How? > >Not defined as in someone tries to flash component X but there is no >version for X in info? > >> So to extend this existing facility with my line card example, we would >> have: >> >> $ devlink dev info >> pci/0000:01:00.0: >> driver mlxsw_spectrum2 >> versions: >> fixed: >> hw.revision A0 >> fw.psid MT_0000000199 >> lc1.hw.revision 0 >> lc1.fw.psid MT_0000000111 >> lc2.hw.revision 0 >> lc2.fw.psid MT_0000000111 >> running: >> fw.version 29.2010.2302 >> fw 29.2010.2302 >> lc1.fw 19.2010.1310 >> lc1.ini.version 4 >> lc2.fw 19.2010.1310 >> lc2.ini.version 4 >> >> And then: >> devlink dev flash pci/0000:01:00.0 component lc1.fw file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 >> >> Does this sound correct? > >I think I suggested something like that in the past, but back then Yes, you did. >I was assuming that lc FW would come from the same large FW bundle >file as the control plan FW, and we would not have to use the component. > >Let's step back and look from the automation perspective again. >Assuming we don't want to hardcode matching "lc$i" there how can >a generic FW update service scan the dev info and decide on what >dev flash command to fire off? Hardcode matching lc$i? I don't follow. It is a part of the version/component name. So if devlink dev info outputs: lc2.fw 19.2010.1310 then you use for devlink dev flash: devlink dev flash pci/0000:01:00.0 component lc2.fw file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 Same name, same string. What am I missing? > >> Also, to avoid free-form, I can imagine to have per-linecard info_get() op >> which would be called for each line card from devlink_nl_info_fill() and >> prefix the "lcX" automatically without driver being involved. >> >> Sounds good? > >Hm. That's moving the matryoshka-ing of the objects from the uAPI level >to the internals. > >If we don't do the string prefix but instead pass the subobject info to >the user space as an attribute per version we can at least avoid >per-subobject commands (DEVLINK_CMD_LINECARD_INFO_GET). Much closer to >how health reporters are implemented than how params are done, so I >think it is a good direction. Sorry, I'm a bit lost. Could you please provide some example about how you envision it? For me it is a guessing game :/ My guess is you would like to add to the version nest where DEVLINK_ATTR_INFO_VERSION_NAME resides for example DEVLINK_ATTR_LINECARD_INDEX? Correct? > >We still need to iron out how the automation can go over the main FW >and sub-objects in a generic way. > >I still think full devlink sub-instance is better because we will end >up needing params or health. Fake devices can be made with auxbus or >otherwise. But if you really don't want sub-instances we can explore >the above. I really don't.
Sat, May 28, 2022 at 05:58:56PM CEST, dsahern@gmail.com wrote: >On 5/24/22 8:31 AM, Jiri Pirko wrote: >> >> $ devlink lc info pci/0000:01:00.0 lc 8 >> pci/0000:01:00.0: > >... > >> >> $ devlink lc flash pci/0000:01:00.0 lc 8 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 > > >A lot of your proposed syntax for devlink commands has 'lc' twice. If >'lc' is the subcommand, then you are managing a linecard making 'lc' >before the '8' redundant. How about 'slot 8' or something along those lines? Well, there is 1:1 match between cmd line options and output, as always. Object name is one thing, the option name is different. It is quite common to name them both the same. I'm not sure I understand why it would be an issue.
On Sun, 29 May 2022 11:23:01 +0200 Jiri Pirko wrote: > >Let's step back and look from the automation perspective again. > >Assuming we don't want to hardcode matching "lc$i" there how can > >a generic FW update service scan the dev info and decide on what > >dev flash command to fire off? > > Hardcode matching lc$i? I don't follow. It is a part of the > version/component name. > So if devlink dev info outputs: > lc2.fw 19.2010.1310 > then you use for devlink dev flash: > devlink dev flash pci/0000:01:00.0 component lc2.fw file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 > Same name, same string. > > What am I missing? Nevermind, I think we can iterate over all the groupings. Since I hope you agreed that component has an established meaning can we use group instead? > >> Also, to avoid free-form, I can imagine to have per-linecard info_get() op > >> which would be called for each line card from devlink_nl_info_fill() and > >> prefix the "lcX" automatically without driver being involved. > >> > >> Sounds good? > > > >Hm. That's moving the matryoshka-ing of the objects from the uAPI level > >to the internals. > > > >If we don't do the string prefix but instead pass the subobject info to > >the user space as an attribute per version we can at least avoid > >per-subobject commands (DEVLINK_CMD_LINECARD_INFO_GET). Much closer to > >how health reporters are implemented than how params are done, so I > >think it is a good direction. > > Sorry, I'm a bit lost. Could you please provide some example about how > you envision it? For me it is a guessing game :/ > My guess is you would like to add to the version nest where > DEVLINK_ATTR_INFO_VERSION_NAME resides for example > DEVLINK_ATTR_LINECARD_INDEX? > > Correct? Yup.
On 5/29/22 3:24 AM, Jiri Pirko wrote: > Sat, May 28, 2022 at 05:58:56PM CEST, dsahern@gmail.com wrote: >> On 5/24/22 8:31 AM, Jiri Pirko wrote: >>> >>> $ devlink lc info pci/0000:01:00.0 lc 8 >>> pci/0000:01:00.0: >> >> ... >> >>> >>> $ devlink lc flash pci/0000:01:00.0 lc 8 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 >> >> >> A lot of your proposed syntax for devlink commands has 'lc' twice. If >> 'lc' is the subcommand, then you are managing a linecard making 'lc' >> before the '8' redundant. How about 'slot 8' or something along those lines? > > Well, there is 1:1 match between cmd line options and output, as always. > > Object name is one thing, the option name is different. It is quite > common to name them both the same. I'm not sure I understand why it > would be an issue. > example? To me it says something is off in your model when you want to use the same keyword twice in a command line.
Tue, May 31, 2022 at 04:11:16AM CEST, dsahern@gmail.com wrote: >On 5/29/22 3:24 AM, Jiri Pirko wrote: >> Sat, May 28, 2022 at 05:58:56PM CEST, dsahern@gmail.com wrote: >>> On 5/24/22 8:31 AM, Jiri Pirko wrote: >>>> >>>> $ devlink lc info pci/0000:01:00.0 lc 8 >>>> pci/0000:01:00.0: >>> >>> ... >>> >>>> >>>> $ devlink lc flash pci/0000:01:00.0 lc 8 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 >>> >>> >>> A lot of your proposed syntax for devlink commands has 'lc' twice. If >>> 'lc' is the subcommand, then you are managing a linecard making 'lc' >>> before the '8' redundant. How about 'slot 8' or something along those lines? >> >> Well, there is 1:1 match between cmd line options and output, as always. >> >> Object name is one thing, the option name is different. It is quite >> common to name them both the same. I'm not sure I understand why it >> would be an issue. >> > >example? To me it says something is off in your model when you want to >use the same keyword twice in a command line. man devlink-trap man devlink-sb
Mon, May 30, 2022 at 09:54:08PM CEST, kuba@kernel.org wrote: >On Sun, 29 May 2022 11:23:01 +0200 Jiri Pirko wrote: >> >Let's step back and look from the automation perspective again. >> >Assuming we don't want to hardcode matching "lc$i" there how can >> >a generic FW update service scan the dev info and decide on what >> >dev flash command to fire off? >> >> Hardcode matching lc$i? I don't follow. It is a part of the >> version/component name. >> So if devlink dev info outputs: >> lc2.fw 19.2010.1310 >> then you use for devlink dev flash: >> devlink dev flash pci/0000:01:00.0 component lc2.fw file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 >> Same name, same string. >> >> What am I missing? > >Nevermind, I think we can iterate over all the groupings. >Since I hope you agreed that component has an established Yeah, component=version. I will send a RFC soon that tights it together. >meaning can we use group instead? Group of what? Could you provide me example what you mean? > >> >> Also, to avoid free-form, I can imagine to have per-linecard info_get() op >> >> which would be called for each line card from devlink_nl_info_fill() and >> >> prefix the "lcX" automatically without driver being involved. >> >> >> >> Sounds good? >> > >> >Hm. That's moving the matryoshka-ing of the objects from the uAPI level >> >to the internals. >> > >> >If we don't do the string prefix but instead pass the subobject info to >> >the user space as an attribute per version we can at least avoid >> >per-subobject commands (DEVLINK_CMD_LINECARD_INFO_GET). Much closer to >> >how health reporters are implemented than how params are done, so I >> >think it is a good direction. >> >> Sorry, I'm a bit lost. Could you please provide some example about how >> you envision it? For me it is a guessing game :/ >> My guess is you would like to add to the version nest where >> DEVLINK_ATTR_INFO_VERSION_NAME resides for example >> DEVLINK_ATTR_LINECARD_INDEX? >> >> Correct? > >Yup. Hmm, in that case, I'm not sure how to do this. As cmd options and outputs should match, we would have: devlink dev info lc2.fw 19.2010.1310 here lc2 and fw are concatenated from DEVLINK_ATTR_LINECARD_INDEX and DEVLINK_ATTR_INFO_VERSION_NAME Now on devlink dev flash side, when I pass "component lc2.fw", how could the "devlink dev flash" know to divide it to DEVLINK_ATTR_LINECARD_INDEX and FLASH_COMPONENT? Should I parse the cmd line option and figure the "lcX." prefix into an attribute? Or, we would have to have something like: devlink dev flash pci/0000:01:00.0 lc 2 component fw file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 But to be consistent with the output, we would have to change "devlink dev info" to something like: pci/0000:01:00.0: versions: running: fw 1.2.3 fw.mgmt 10.20.30 lc 2 fw 19.2010.1310 But that would break the existing JSON output, because "running" is an array: "running": { "fw": "1.2.3", "fw.mgmt": "10.20.30" }, So probably better to stick to "lcx.y" notation in both devlink dev info and flash and split/squash to attributes internally. What do you think?
On Tue, 31 May 2022 09:11:27 +0200 Jiri Pirko wrote: > >Nevermind, I think we can iterate over all the groupings. > >Since I hope you agreed that component has an established > > Yeah, component=version. I will send a RFC soon that tights it together. > > >meaning can we use group instead? > > Group of what? Could you provide me example what you mean? Group of components. As explained component has an existing meaning, we can't reuse the term with a different one now. > >> Sorry, I'm a bit lost. Could you please provide some example about how > >> you envision it? For me it is a guessing game :/ > >> My guess is you would like to add to the version nest where > >> DEVLINK_ATTR_INFO_VERSION_NAME resides for example > >> DEVLINK_ATTR_LINECARD_INDEX? > >> > >> Correct? > > > >Yup. > > Hmm, in that case, I'm not sure how to do this. As cmd options and > outputs should match, we would have: > > devlink dev info > lc2.fw 19.2010.1310 > > here lc2 and fw are concatenated from DEVLINK_ATTR_LINECARD_INDEX and DEVLINK_ATTR_INFO_VERSION_NAME lc2 is the group name. > Now on devlink dev flash side, when I pass "component lc2.fw", how could > the "devlink dev flash" know to divide it to DEVLINK_ATTR_LINECARD_INDEX > and FLASH_COMPONENT? Should I parse the cmd line option and figure the > "lcX." prefix into an attribute? > > Or, we would have to have something like: > devlink dev flash pci/0000:01:00.0 lc 2 component fw file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 Yup, it'll make DaveA happy as well. > But to be consistent with the output, we would have to change "devlink > dev info" to something like: > pci/0000:01:00.0: > versions: > running: > fw 1.2.3 > fw.mgmt 10.20.30 > lc 2 fw 19.2010.1310 Yup. > But that would break the existing JSON output, because "running" is an array: > "running": { > "fw": "1.2.3", > "fw.mgmt": "10.20.30" > }, No, the lc versions should be in separate nests. Since they are not updated when flashing main FW mixing them into existing versions would break uAPI. > So probably better to stick to "lcx.y" notation in both devlink dev info > and flash and split/squash to attributes internally. What do you think? BTW how do you intend to activate the new FW? Extend the reload command?
Tue, May 31, 2022 at 05:05:55PM CEST, kuba@kernel.org wrote: >On Tue, 31 May 2022 09:11:27 +0200 Jiri Pirko wrote: >> >Nevermind, I think we can iterate over all the groupings. >> >Since I hope you agreed that component has an established >> >> Yeah, component=version. I will send a RFC soon that tights it together. >> >> >meaning can we use group instead? >> >> Group of what? Could you provide me example what you mean? > >Group of components. As explained component has an existing meaning, >we can't reuse the term with a different one now. I still don't follow. I don't want to reuse it. Really, could you be more specific and show examples, please? > >> >> Sorry, I'm a bit lost. Could you please provide some example about how >> >> you envision it? For me it is a guessing game :/ >> >> My guess is you would like to add to the version nest where >> >> DEVLINK_ATTR_INFO_VERSION_NAME resides for example >> >> DEVLINK_ATTR_LINECARD_INDEX? >> >> >> >> Correct? >> > >> >Yup. >> >> Hmm, in that case, I'm not sure how to do this. As cmd options and >> outputs should match, we would have: >> >> devlink dev info >> lc2.fw 19.2010.1310 >> >> here lc2 and fw are concatenated from DEVLINK_ATTR_LINECARD_INDEX and DEVLINK_ATTR_INFO_VERSION_NAME > >lc2 is the group name. > >> Now on devlink dev flash side, when I pass "component lc2.fw", how could >> the "devlink dev flash" know to divide it to DEVLINK_ATTR_LINECARD_INDEX >> and FLASH_COMPONENT? Should I parse the cmd line option and figure the >> "lcX." prefix into an attribute? >> >> Or, we would have to have something like: >> devlink dev flash pci/0000:01:00.0 lc 2 component fw file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 > >Yup, it'll make DaveA happy as well. > >> But to be consistent with the output, we would have to change "devlink >> dev info" to something like: >> pci/0000:01:00.0: >> versions: >> running: >> fw 1.2.3 >> fw.mgmt 10.20.30 >> lc 2 fw 19.2010.1310 > >Yup. Set, you say "yup" but below you say it should be in a separate nest. That is confusing me. > >> But that would break the existing JSON output, because "running" is an array: >> "running": { >> "fw": "1.2.3", >> "fw.mgmt": "10.20.30" >> }, > >No, the lc versions should be in separate nests. Since they are not >updated when flashing main FW mixing them into existing versions would >break uAPI. Could you please draw it out for me exacly as you envision it? We are dancing around it, I can't really understand what exactly do you mean. > >> So probably better to stick to "lcx.y" notation in both devlink dev info >> and flash and split/squash to attributes internally. What do you think? > >BTW how do you intend to activate the new FW? Extend the reload command? Not sure now. Extending reload is an option. Have to think about it.
On Tue, 31 May 2022 17:51:36 +0200 Jiri Pirko wrote: > Tue, May 31, 2022 at 05:05:55PM CEST, kuba@kernel.org wrote: > >> Group of what? Could you provide me example what you mean? > > > >Group of components. As explained component has an existing meaning, > >we can't reuse the term with a different one now. > > I still don't follow. I don't want to reuse it. > Really, could you be more specific and show examples, please? What you had in your previous examples, just don't call it components but come up with a new term: $ devlink dev info pci/0000:01:00.0: driver mlxsw_spectrum2 versions: fixed: hw.revision A0 fw.psid MT_0000000199 running: fw.version 29.2010.2302 fw 29.2010.2302 groups? sections? subordinates? : <= here lc1: versions: fixed: hw.revision 0 fw.psid MT_0000000111 running: fw 19.2010.1310 ini.version 4 Note that lc1 is not a nest at netlink level but user space can group the attrs pretty trivially. > >> But to be consistent with the output, we would have to change "devlink > >> dev info" to something like: > >> pci/0000:01:00.0: > >> versions: > >> running: > >> fw 1.2.3 > >> fw.mgmt 10.20.30 > >> lc 2 fw 19.2010.1310 > > > >Yup. > > Set, you say "yup" but below you say it should be in a separate nest. > That is confusing me. Ah, sorry. I hope the above is clear. > >> But that would break the existing JSON output, because "running" is an array: > >> "running": { > >> "fw": "1.2.3", > >> "fw.mgmt": "10.20.30" > >> }, > > > >No, the lc versions should be in separate nests. Since they are not > >updated when flashing main FW mixing them into existing versions would > >break uAPI. > > Could you please draw it out for me exacly as you envision it? We are > dancing around it, I can't really understand what exactly do you mean. Why would I prototype your feature for you? I prefer a separate dl instance. If you want to explore other options the "drawing out" is on you :/ > >> So probably better to stick to "lcx.y" notation in both devlink dev info > >> and flash and split/squash to attributes internally. What do you think? > > > >BTW how do you intend to activate the new FW? Extend the reload command? > > Not sure now. Extending reload is an option. Have to think about it.
Tue, May 31, 2022 at 06:08:52PM CEST, kuba@kernel.org wrote: >On Tue, 31 May 2022 17:51:36 +0200 Jiri Pirko wrote: >> Tue, May 31, 2022 at 05:05:55PM CEST, kuba@kernel.org wrote: >> >> Group of what? Could you provide me example what you mean? >> > >> >Group of components. As explained component has an existing meaning, >> >we can't reuse the term with a different one now. >> >> I still don't follow. I don't want to reuse it. >> Really, could you be more specific and show examples, please? > >What you had in your previous examples, just don't call it components >but come up with a new term: > >$ devlink dev info >pci/0000:01:00.0: > driver mlxsw_spectrum2 > versions: > fixed: > hw.revision A0 > fw.psid MT_0000000199 > running: > fw.version 29.2010.2302 > fw 29.2010.2302 > groups? sections? subordinates? : <= here > lc1: > versions: > fixed: > hw.revision 0 > fw.psid MT_0000000111 > running: > fw 19.2010.1310 > ini.version 4 > >Note that lc1 is not a nest at netlink level but user space can group >the attrs pretty trivially. Awesome! I think now it is clearer. To be in sync with devlink dev flash cmd option, we have to have "lc 1" here, have to think how that can be done. > >> >> But to be consistent with the output, we would have to change "devlink >> >> dev info" to something like: >> >> pci/0000:01:00.0: >> >> versions: >> >> running: >> >> fw 1.2.3 >> >> fw.mgmt 10.20.30 >> >> lc 2 fw 19.2010.1310 >> > >> >Yup. >> >> Set, you say "yup" but below you say it should be in a separate nest. >> That is confusing me. > >Ah, sorry. I hope the above is clear. > >> >> But that would break the existing JSON output, because "running" is an array: >> >> "running": { >> >> "fw": "1.2.3", >> >> "fw.mgmt": "10.20.30" >> >> }, >> > >> >No, the lc versions should be in separate nests. Since they are not >> >updated when flashing main FW mixing them into existing versions would >> >break uAPI. >> >> Could you please draw it out for me exacly as you envision it? We are >> dancing around it, I can't really understand what exactly do you mean. > >Why would I prototype your feature for you? I prefer a separate dl >instance. If you want to explore other options the "drawing out" is >on you :/ Well, you are basically leading my arm when I'm drawing the thing. Looks like you exactly know what you are looking for. That is why. If you let me to the stuff my way, we would be already done. You have to decide which way you want this. And again, for the record, I strongly believe that a separate dl instance for this does not make any sense at all :/ I wonder why you still think it does. > >> >> So probably better to stick to "lcx.y" notation in both devlink dev info >> >> and flash and split/squash to attributes internally. What do you think? >> > >> >BTW how do you intend to activate the new FW? Extend the reload command? >> >> Not sure now. Extending reload is an option. Have to think about it. > >
On Tue, 31 May 2022 21:34:42 +0200 Jiri Pirko wrote: > And again, for the record, I strongly believe that a separate dl > instance for this does not make any sense at all :/ I wonder why you > still think it does. For purely software reuse reasons. I think the line cards will require a lot of the same attributes as the full devlink instance, so making them a subobject which can have all the same attributes is poor SW arch. Think about it from OOP perspective, you'd definitely factor all that stuff out to an abstract class. We can't do that in netlink but whatever just make it a full dl instance and describe the link between the two. Most NIC vendors (everyone excluding Netronome?) decided that devlink instance is equivalent to a bus device which IIUC it was not supposed to be. It was supposed to be the whole ASIC. If we're okay to stretch the definition of a dl instance to be "any independently controllable unit of HW" for NICs then IDK why we can't make a line card a dl instance. Are you afraid of hiding dependencies?
Wed, Jun 01, 2022 at 12:41:59AM CEST, kuba@kernel.org wrote: >On Tue, 31 May 2022 21:34:42 +0200 Jiri Pirko wrote: >> And again, for the record, I strongly believe that a separate dl >> instance for this does not make any sense at all :/ I wonder why you >> still think it does. > >For purely software reuse reasons. I think the line cards will require >a lot of the same attributes as the full devlink instance, so making >them a subobject which can have all the same attributes is poor SW arch. Sure, I understand the motivation. >Think about it from OOP perspective, you'd definitely factor all that >stuff out to an abstract class. We can't do that in netlink but whatever >just make it a full dl instance and describe the link between the two. > >Most NIC vendors (everyone excluding Netronome?) decided that devlink >instance is equivalent to a bus device which IIUC it was not supposed >to be. It was supposed to be the whole ASIC. If we're okay to stretch I agree, that is incorrect. That is why I was thinking about sort of "alias" to make it right (2 PF devlink instances would be one connected by alias). Not implemented yet though :/ >the definition of a dl instance to be "any independently controllable >unit of HW" for NICs then IDK why we can't make a line card a dl >instance. Well, it is not independently controllable. Well, truth is, that in our current implementation, there is one independent "configuration", and that is flash burn of the gearbox. It is done using a "tunnelling" register which encapsulates register communication what is done during flash burning. > >Are you afraid of hiding dependencies? Not really, I'm just not sure I see it is worth the excercise. In czech, we have this saying: "kanon na vrabce". I think that the following picture is better than any translation :) https://i.iinfo.cz/images/72/shutterstock-com-kanon-delo-ptak-vrabec-strilet-1.jpg Will think about it some more.