mbox series

[net-next,00/11] mlxsw: Implement dev info and dev flash for line cards

Message ID 20220614123326.69745-1-jiri@resnulli.us (mailing list archive)
Headers show
Series mlxsw: Implement dev info and dev flash for line cards | expand

Message

Jiri Pirko June 14, 2022, 12:33 p.m. UTC
From: Jiri Pirko <jiri@nvidia.com>

This patchset implements two features:
1) "devlink dev info" is exposed for line card (patches 3-8)
2) "devlink dev flash" is implemented for line card gearbox
   flashing (patch 9)

For every line card, "a nested" auxiliary device is created which
allows to bind the features mentioned above (patch 2).

The relationship between line card and its auxiliary dev devlink
is carried over extra line card netlink attribute (patches 1 and 3).

Examples:

$ devlink lc show pci/0000:01:00.0 lc 1
pci/0000:01:00.0:
  lc 1 state active type 16x100G nested_devlink auxiliary/mlxsw_core.lc.0
    supported_types:
       16x100G

$ devlink dev show auxiliary/mlxsw_core.lc.0
auxiliary/mlxsw_core.lc.0

$ devlink dev info auxiliary/mlxsw_core.lc.0
auxiliary/mlxsw_core.lc.0:
  versions:
      fixed:
        hw.revision 0
        fw.psid MT_0000000749
      running:
        ini.version 4
        fw 19.2010.1312

$ devlink dev flash auxiliary/mlxsw_core.lc.0 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2

Jiri Pirko (11):
  devlink: introduce nested devlink entity for line card
  mlxsw: core_linecards: Introduce per line card auxiliary device
  mlxsw: core_linecard_dev: Set nested devlink relationship for a line
    card
  mlxsw: core_linecards: Expose HW revision and INI version
  mlxsw: reg: Extend MDDQ by device_info
  mlxsw: core_linecards: Probe provisioned line cards for devices and
    expose FW version
  mlxsw: reg: Add Management DownStream Device Tunneling Register
  mlxsw: core_linecards: Expose device PSID over device info
  mlxsw: core_linecards: Implement line card device flashing
  selftests: mlxsw: Check line card info on provisioned line card
  selftests: mlxsw: Check line card info on activated line card

 Documentation/networking/devlink/mlxsw.rst    |  24 ++
 drivers/net/ethernet/mellanox/mlxsw/Kconfig   |   1 +
 drivers/net/ethernet/mellanox/mlxsw/Makefile  |   2 +-
 drivers/net/ethernet/mellanox/mlxsw/core.c    |  44 +-
 drivers/net/ethernet/mellanox/mlxsw/core.h    |  35 ++
 .../mellanox/mlxsw/core_linecard_dev.c        | 180 ++++++++
 .../ethernet/mellanox/mlxsw/core_linecards.c  | 403 ++++++++++++++++++
 drivers/net/ethernet/mellanox/mlxsw/reg.h     | 173 +++++++-
 include/net/devlink.h                         |   2 +
 include/uapi/linux/devlink.h                  |   2 +
 net/core/devlink.c                            |  42 ++
 .../drivers/net/mlxsw/devlink_linecard.sh     |  54 +++
 12 files changed, 948 insertions(+), 14 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlxsw/core_linecard_dev.c

Comments

Ido Schimmel June 15, 2022, 9:13 a.m. UTC | #1
On Tue, Jun 14, 2022 at 02:33:15PM +0200, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@nvidia.com>
> 
> This patchset implements two features:
> 1) "devlink dev info" is exposed for line card (patches 3-8)
> 2) "devlink dev flash" is implemented for line card gearbox
>    flashing (patch 9)
> 
> For every line card, "a nested" auxiliary device is created which
> allows to bind the features mentioned above (patch 2).

The design choice to use an auxiliary device for each line card needs to
be explained in the cover letter. From what I can tell, the motivation
is to reuse the above devlink uAPI for line cards as opposed to using
the "component" attribute or adding new uAPI. This is achieved by
creating a devlink instance for each line card. The auxiliary device is
needed because each devlink instance is expected to be associated with a
device. Does this constitute proper use of the auxiliary bus?

> 
> The relationship between line card and its auxiliary dev devlink
> is carried over extra line card netlink attribute (patches 1 and 3).
> 
> Examples:
> 
> $ devlink lc show pci/0000:01:00.0 lc 1
> pci/0000:01:00.0:
>   lc 1 state active type 16x100G nested_devlink auxiliary/mlxsw_core.lc.0

Can we try to use the index of the line card as the identifier of the
auxiliary device?

>     supported_types:
>        16x100G
> 
> $ devlink dev show auxiliary/mlxsw_core.lc.0
> auxiliary/mlxsw_core.lc.0

I assume that the auxiliary device cannot outlive line card. Does that
mean that as part of reload of the primary devlink instance the nested
devlink instance is removed? If so, did you check the reload flow with
lockdep to ensure there aren't any problems?

> 
> $ devlink dev info auxiliary/mlxsw_core.lc.0
> auxiliary/mlxsw_core.lc.0:
>   versions:
>       fixed:
>         hw.revision 0
>         fw.psid MT_0000000749
>       running:
>         ini.version 4
>         fw 19.2010.1312
> 
> $ devlink dev flash auxiliary/mlxsw_core.lc.0 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2

How is this firmware activated? It is usually done after reload, but I
don't see reload implementation for the line card devlink instance.

> 
> Jiri Pirko (11):
>   devlink: introduce nested devlink entity for line card
>   mlxsw: core_linecards: Introduce per line card auxiliary device
>   mlxsw: core_linecard_dev: Set nested devlink relationship for a line
>     card
>   mlxsw: core_linecards: Expose HW revision and INI version
>   mlxsw: reg: Extend MDDQ by device_info
>   mlxsw: core_linecards: Probe provisioned line cards for devices and
>     expose FW version
>   mlxsw: reg: Add Management DownStream Device Tunneling Register
>   mlxsw: core_linecards: Expose device PSID over device info
>   mlxsw: core_linecards: Implement line card device flashing
>   selftests: mlxsw: Check line card info on provisioned line card
>   selftests: mlxsw: Check line card info on activated line card
> 
>  Documentation/networking/devlink/mlxsw.rst    |  24 ++
>  drivers/net/ethernet/mellanox/mlxsw/Kconfig   |   1 +
>  drivers/net/ethernet/mellanox/mlxsw/Makefile  |   2 +-
>  drivers/net/ethernet/mellanox/mlxsw/core.c    |  44 +-
>  drivers/net/ethernet/mellanox/mlxsw/core.h    |  35 ++
>  .../mellanox/mlxsw/core_linecard_dev.c        | 180 ++++++++
>  .../ethernet/mellanox/mlxsw/core_linecards.c  | 403 ++++++++++++++++++
>  drivers/net/ethernet/mellanox/mlxsw/reg.h     | 173 +++++++-
>  include/net/devlink.h                         |   2 +
>  include/uapi/linux/devlink.h                  |   2 +
>  net/core/devlink.c                            |  42 ++
>  .../drivers/net/mlxsw/devlink_linecard.sh     |  54 +++
>  12 files changed, 948 insertions(+), 14 deletions(-)
>  create mode 100644 drivers/net/ethernet/mellanox/mlxsw/core_linecard_dev.c
> 
> -- 
> 2.35.3
>
Jiri Pirko June 15, 2022, 5:40 p.m. UTC | #2
Wed, Jun 15, 2022 at 11:13:35AM CEST, idosch@nvidia.com wrote:
>On Tue, Jun 14, 2022 at 02:33:15PM +0200, Jiri Pirko wrote:
>> From: Jiri Pirko <jiri@nvidia.com>
>> 
>> This patchset implements two features:
>> 1) "devlink dev info" is exposed for line card (patches 3-8)
>> 2) "devlink dev flash" is implemented for line card gearbox
>>    flashing (patch 9)
>> 
>> For every line card, "a nested" auxiliary device is created which
>> allows to bind the features mentioned above (patch 2).
>
>The design choice to use an auxiliary device for each line card needs to
>be explained in the cover letter. From what I can tell, the motivation
>is to reuse the above devlink uAPI for line cards as opposed to using
>the "component" attribute or adding new uAPI. This is achieved by
>creating a devlink instance for each line card. The auxiliary device is
>needed because each devlink instance is expected to be associated with a
>device. Does this constitute proper use of the auxiliary bus?

Right, will describe this in cover letter.


>
>> 
>> The relationship between line card and its auxiliary dev devlink
>> is carried over extra line card netlink attribute (patches 1 and 3).
>> 
>> Examples:
>> 
>> $ devlink lc show pci/0000:01:00.0 lc 1
>> pci/0000:01:00.0:
>>   lc 1 state active type 16x100G nested_devlink auxiliary/mlxsw_core.lc.0
>
>Can we try to use the index of the line card as the identifier of the
>auxiliary device?

Not really. We would have a collision if there are 2 mlxsw instances.


>
>>     supported_types:
>>        16x100G
>> 
>> $ devlink dev show auxiliary/mlxsw_core.lc.0
>> auxiliary/mlxsw_core.lc.0
>
>I assume that the auxiliary device cannot outlive line card. Does that
>mean that as part of reload of the primary devlink instance the nested
>devlink instance is removed? If so, did you check the reload flow with
>lockdep to ensure there aren't any problems?

As I wrote in the other email, line card auxdev should be removed during
linecard_fini(), will add that to v2.


>
>> 
>> $ devlink dev info auxiliary/mlxsw_core.lc.0
>> auxiliary/mlxsw_core.lc.0:
>>   versions:
>>       fixed:
>>         hw.revision 0
>>         fw.psid MT_0000000749
>>       running:
>>         ini.version 4
>>         fw 19.2010.1312
>> 
>> $ devlink dev flash auxiliary/mlxsw_core.lc.0 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2
>
>How is this firmware activated? It is usually done after reload, but I
>don't see reload implementation for the line card devlink instance.

Currently, only devlink dev reload of the whole mlxsw instance or
unprovision/provision of a line card.

>
>> 
>> Jiri Pirko (11):
>>   devlink: introduce nested devlink entity for line card
>>   mlxsw: core_linecards: Introduce per line card auxiliary device
>>   mlxsw: core_linecard_dev: Set nested devlink relationship for a line
>>     card
>>   mlxsw: core_linecards: Expose HW revision and INI version
>>   mlxsw: reg: Extend MDDQ by device_info
>>   mlxsw: core_linecards: Probe provisioned line cards for devices and
>>     expose FW version
>>   mlxsw: reg: Add Management DownStream Device Tunneling Register
>>   mlxsw: core_linecards: Expose device PSID over device info
>>   mlxsw: core_linecards: Implement line card device flashing
>>   selftests: mlxsw: Check line card info on provisioned line card
>>   selftests: mlxsw: Check line card info on activated line card
>> 
>>  Documentation/networking/devlink/mlxsw.rst    |  24 ++
>>  drivers/net/ethernet/mellanox/mlxsw/Kconfig   |   1 +
>>  drivers/net/ethernet/mellanox/mlxsw/Makefile  |   2 +-
>>  drivers/net/ethernet/mellanox/mlxsw/core.c    |  44 +-
>>  drivers/net/ethernet/mellanox/mlxsw/core.h    |  35 ++
>>  .../mellanox/mlxsw/core_linecard_dev.c        | 180 ++++++++
>>  .../ethernet/mellanox/mlxsw/core_linecards.c  | 403 ++++++++++++++++++
>>  drivers/net/ethernet/mellanox/mlxsw/reg.h     | 173 +++++++-
>>  include/net/devlink.h                         |   2 +
>>  include/uapi/linux/devlink.h                  |   2 +
>>  net/core/devlink.c                            |  42 ++
>>  .../drivers/net/mlxsw/devlink_linecard.sh     |  54 +++
>>  12 files changed, 948 insertions(+), 14 deletions(-)
>>  create mode 100644 drivers/net/ethernet/mellanox/mlxsw/core_linecard_dev.c
>> 
>> -- 
>> 2.35.3
>>
Ido Schimmel June 16, 2022, 7:03 a.m. UTC | #3
On Wed, Jun 15, 2022 at 07:40:34PM +0200, Jiri Pirko wrote:
> Wed, Jun 15, 2022 at 11:13:35AM CEST, idosch@nvidia.com wrote:
> >On Tue, Jun 14, 2022 at 02:33:15PM +0200, Jiri Pirko wrote:
> >> $ devlink dev flash auxiliary/mlxsw_core.lc.0 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2
> >
> >How is this firmware activated? It is usually done after reload, but I
> >don't see reload implementation for the line card devlink instance.
> 
> Currently, only devlink dev reload of the whole mlxsw instance or
> unprovision/provision of a line card.

OK, please at least mention it in the commit message that adds flashing
support.

What about implementing reload support as unprovision/provision?
Jiri Pirko June 16, 2022, 1:11 p.m. UTC | #4
Thu, Jun 16, 2022 at 09:03:52AM CEST, idosch@nvidia.com wrote:
>On Wed, Jun 15, 2022 at 07:40:34PM +0200, Jiri Pirko wrote:
>> Wed, Jun 15, 2022 at 11:13:35AM CEST, idosch@nvidia.com wrote:
>> >On Tue, Jun 14, 2022 at 02:33:15PM +0200, Jiri Pirko wrote:
>> >> $ devlink dev flash auxiliary/mlxsw_core.lc.0 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2
>> >
>> >How is this firmware activated? It is usually done after reload, but I
>> >don't see reload implementation for the line card devlink instance.
>> 
>> Currently, only devlink dev reload of the whole mlxsw instance or
>> unprovision/provision of a line card.
>
>OK, please at least mention it in the commit message that adds flashing
>support.
>
>What about implementing reload support as unprovision/provision?

Yes, that can be done eventually. I was thinking about that as well.
Ido Schimmel June 16, 2022, 2:47 p.m. UTC | #5
On Thu, Jun 16, 2022 at 03:11:57PM +0200, Jiri Pirko wrote:
> Thu, Jun 16, 2022 at 09:03:52AM CEST, idosch@nvidia.com wrote:
> >On Wed, Jun 15, 2022 at 07:40:34PM +0200, Jiri Pirko wrote:
> >> Wed, Jun 15, 2022 at 11:13:35AM CEST, idosch@nvidia.com wrote:
> >> >On Tue, Jun 14, 2022 at 02:33:15PM +0200, Jiri Pirko wrote:
> >> >> $ devlink dev flash auxiliary/mlxsw_core.lc.0 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2
> >> >
> >> >How is this firmware activated? It is usually done after reload, but I
> >> >don't see reload implementation for the line card devlink instance.
> >> 
> >> Currently, only devlink dev reload of the whole mlxsw instance or
> >> unprovision/provision of a line card.
> >
> >OK, please at least mention it in the commit message that adds flashing
> >support.
> >
> >What about implementing reload support as unprovision/provision?
> 
> Yes, that can be done eventually. I was thinking about that as well.

This patch should come before the one that adds flashing. Then both the
primary and nested devlink instances maintain the same semantics with
regards to firmware flashing / activation.
Jiri Pirko June 16, 2022, 4:37 p.m. UTC | #6
Thu, Jun 16, 2022 at 04:47:50PM CEST, idosch@nvidia.com wrote:
>On Thu, Jun 16, 2022 at 03:11:57PM +0200, Jiri Pirko wrote:
>> Thu, Jun 16, 2022 at 09:03:52AM CEST, idosch@nvidia.com wrote:
>> >On Wed, Jun 15, 2022 at 07:40:34PM +0200, Jiri Pirko wrote:
>> >> Wed, Jun 15, 2022 at 11:13:35AM CEST, idosch@nvidia.com wrote:
>> >> >On Tue, Jun 14, 2022 at 02:33:15PM +0200, Jiri Pirko wrote:
>> >> >> $ devlink dev flash auxiliary/mlxsw_core.lc.0 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2
>> >> >
>> >> >How is this firmware activated? It is usually done after reload, but I
>> >> >don't see reload implementation for the line card devlink instance.
>> >> 
>> >> Currently, only devlink dev reload of the whole mlxsw instance or
>> >> unprovision/provision of a line card.
>> >
>> >OK, please at least mention it in the commit message that adds flashing
>> >support.
>> >
>> >What about implementing reload support as unprovision/provision?
>> 
>> Yes, that can be done eventually. I was thinking about that as well.
>
>This patch should come before the one that adds flashing. Then both the
>primary and nested devlink instances maintain the same semantics with
>regards to firmware flashing / activation.

Ok.
Shannon Nelson June 18, 2022, 6:12 a.m. UTC | #7
On 6/15/22 10:40 AM, Jiri Pirko wrote:
> Wed, Jun 15, 2022 at 11:13:35AM CEST, idosch@nvidia.com wrote:
>> On Tue, Jun 14, 2022 at 02:33:15PM +0200, Jiri Pirko wrote:
>>> From: Jiri Pirko <jiri@nvidia.com>
>>>
>>> This patchset implements two features:
>>> 1) "devlink dev info" is exposed for line card (patches 3-8)
>>> 2) "devlink dev flash" is implemented for line card gearbox
>>>     flashing (patch 9)
>>>
>>> For every line card, "a nested" auxiliary device is created which
>>> allows to bind the features mentioned above (patch 2).
>>
[...]>>
>>>
>>> The relationship between line card and its auxiliary dev devlink
>>> is carried over extra line card netlink attribute (patches 1 and 3).
>>>
>>> Examples:
>>>
>>> $ devlink lc show pci/0000:01:00.0 lc 1
>>> pci/0000:01:00.0:
>>>    lc 1 state active type 16x100G nested_devlink auxiliary/mlxsw_core.lc.0
>>
>> Can we try to use the index of the line card as the identifier of the
>> auxiliary device?
> 
> Not really. We would have a collision if there are 2 mlxsw instances.
> 

Can you encode the base device's PCI info into the auxiliary device's id
to make it unique?  Or maybe have each mlxsw instance have a unique ida 
value to encode in the linecard auxiliary device id?

sln
Jiri Pirko June 27, 2022, 8:43 a.m. UTC | #8
Sat, Jun 18, 2022 at 08:12:20AM CEST, snelson@pensando.io wrote:
>
>
>On 6/15/22 10:40 AM, Jiri Pirko wrote:
>> Wed, Jun 15, 2022 at 11:13:35AM CEST, idosch@nvidia.com wrote:
>> > On Tue, Jun 14, 2022 at 02:33:15PM +0200, Jiri Pirko wrote:
>> > > From: Jiri Pirko <jiri@nvidia.com>
>> > > 
>> > > This patchset implements two features:
>> > > 1) "devlink dev info" is exposed for line card (patches 3-8)
>> > > 2) "devlink dev flash" is implemented for line card gearbox
>> > >     flashing (patch 9)
>> > > 
>> > > For every line card, "a nested" auxiliary device is created which
>> > > allows to bind the features mentioned above (patch 2).
>> > 
>[...]>>
>> > > 
>> > > The relationship between line card and its auxiliary dev devlink
>> > > is carried over extra line card netlink attribute (patches 1 and 3).
>> > > 
>> > > Examples:
>> > > 
>> > > $ devlink lc show pci/0000:01:00.0 lc 1
>> > > pci/0000:01:00.0:
>> > >    lc 1 state active type 16x100G nested_devlink auxiliary/mlxsw_core.lc.0
>> > 
>> > Can we try to use the index of the line card as the identifier of the
>> > auxiliary device?
>> 
>> Not really. We would have a collision if there are 2 mlxsw instances.
>> 
>
>Can you encode the base device's PCI info into the auxiliary device's id

Would look odd to he PCI BDF in auxdev addsess, wouldn't it?


>to make it unique?  Or maybe have each mlxsw instance have a unique ida value
>to encode in the linecard auxiliary device id?

Well, which value would that bring? It would be dynamic random number.
How the use would use that tho figure out the relation to the mlxsw
instance?

>
>sln
Shannon Nelson June 27, 2022, 6:38 p.m. UTC | #9
On 6/27/22 1:43 AM, Jiri Pirko wrote:
> Sat, Jun 18, 2022 at 08:12:20AM CEST, snelson@pensando.io wrote:
>>
>>
>> On 6/15/22 10:40 AM, Jiri Pirko wrote:
>>> Wed, Jun 15, 2022 at 11:13:35AM CEST, idosch@nvidia.com wrote:
>>>> On Tue, Jun 14, 2022 at 02:33:15PM +0200, Jiri Pirko wrote:
>>>>> From: Jiri Pirko <jiri@nvidia.com>
>>>>>
>>>>> This patchset implements two features:
>>>>> 1) "devlink dev info" is exposed for line card (patches 3-8)
>>>>> 2) "devlink dev flash" is implemented for line card gearbox
>>>>>      flashing (patch 9)
>>>>>
>>>>> For every line card, "a nested" auxiliary device is created which
>>>>> allows to bind the features mentioned above (patch 2).
>>>>
>> [...]>>
>>>>>
>>>>> The relationship between line card and its auxiliary dev devlink
>>>>> is carried over extra line card netlink attribute (patches 1 and 3).
>>>>>
>>>>> Examples:
>>>>>
>>>>> $ devlink lc show pci/0000:01:00.0 lc 1
>>>>> pci/0000:01:00.0:
>>>>>     lc 1 state active type 16x100G nested_devlink auxiliary/mlxsw_core.lc.0
>>>>
>>>> Can we try to use the index of the line card as the identifier of the
>>>> auxiliary device?
>>>
>>> Not really. We would have a collision if there are 2 mlxsw instances.
>>>
>>
>> Can you encode the base device's PCI info into the auxiliary device's id
> 
> Would look odd to he PCI BDF in auxdev addsess, wouldn't it?

Sure, it looks a little odd to see something like mycore.app.1281, but 
it does afford the auxiliary driver, and any other observer, a way to 
figure out which device it is representing.  This also works nicely when 
trying to associate an auxiliary driver instance for a VF with the 
matching VF PCI driver instance.

sln
Jakub Kicinski June 27, 2022, 6:52 p.m. UTC | #10
On Mon, 27 Jun 2022 11:38:50 -0700 Shannon Nelson wrote:
> >> Can you encode the base device's PCI info into the auxiliary device's id  
> > 
> > Would look odd to he PCI BDF in auxdev addsess, wouldn't it?  
> 
> Sure, it looks a little odd to see something like mycore.app.1281, but 
> it does afford the auxiliary driver, and any other observer, a way to 
> figure out which device it is representing.  This also works nicely when 
> trying to associate an auxiliary driver instance for a VF with the 
> matching VF PCI driver instance.

I'd personally not mind divorcing devlink from bus devices a little
more. On one hand we have cases like this where there's naturally no
bus device, on the other we have multi-link PCI devices which want to
straddle NUMA nodes but otherwise are just a logical unit.
Jiri Pirko June 28, 2022, 7:06 a.m. UTC | #11
Mon, Jun 27, 2022 at 08:52:09PM CEST, kuba@kernel.org wrote:
>On Mon, 27 Jun 2022 11:38:50 -0700 Shannon Nelson wrote:
>> >> Can you encode the base device's PCI info into the auxiliary device's id  
>> > 
>> > Would look odd to he PCI BDF in auxdev addsess, wouldn't it?  
>> 
>> Sure, it looks a little odd to see something like mycore.app.1281, but 
>> it does afford the auxiliary driver, and any other observer, a way to 
>> figure out which device it is representing.  This also works nicely when 
>> trying to associate an auxiliary driver instance for a VF with the 
>> matching VF PCI driver instance.
>
>I'd personally not mind divorcing devlink from bus devices a little
>more. On one hand we have cases like this where there's naturally no

How exactly do you envision to do this? There is a good reason to have
the handle based on bus/name, as it is constant and predictable.


>bus device, on the other we have multi-link PCI devices which want to
>straddle NUMA nodes but otherwise are just a logical unit.