[net-next,v1,0/5] Devlink reload and missed notifications fix

Message ID: cover.1632916329.git.leonro@nvidia.com

Message

Leon Romanovsky Sept. 29, 2021, noon UTC
From: Leon Romanovsky <leonro@nvidia.com>

Changelog:
v1:
 * Missed removal of extra WARN_ON
 * Added "ops" parameter to macro as Dan suggested.
v0: https://lore.kernel.org/all/cover.1632909221.git.leonro@nvidia.com

-------------------------------------------------------------------
Hi,

This series starts by fixing a bug introduced by the devlink delayed
notifications logic, where I missed some of the notification functions.

The rest of the series provides a way to dynamically set devlink ops,
which is needed for the mlx5 multiport device, and starts the cleanup by
removing unneeded logic.

In the next series, we will delete various publish APIs, drop the general
lock, annotate the code and rework the logic around devlink->lock.

All this is possible because driver initialization is separated from the
user input now.

Thanks

Leon Romanovsky (5):
  devlink: Add missed notifications iterators
  devlink: Allow modification of devlink ops
  devlink: Allow set specific ops callbacks dynamically
  net/mlx5: Register separate reload devlink ops for multiport device
  devlink: Delete reload enable/disable interface

 .../net/ethernet/broadcom/bnxt/bnxt_devlink.c |   6 +-
 .../net/ethernet/cavium/liquidio/lio_main.c   |   2 +-
 .../freescale/dpaa2/dpaa2-eth-devlink.c       |   2 +-
 .../hisilicon/hns3/hns3pf/hclge_devlink.c     |   5 +-
 .../hisilicon/hns3/hns3vf/hclgevf_devlink.c   |   5 +-
 .../net/ethernet/huawei/hinic/hinic_devlink.c |   2 +-
 drivers/net/ethernet/intel/ice/ice_devlink.c  |   2 +-
 .../marvell/octeontx2/af/rvu_devlink.c        |   2 +-
 .../marvell/prestera/prestera_devlink.c       |   2 +-
 drivers/net/ethernet/mellanox/mlx4/main.c     |   4 +-
 .../net/ethernet/mellanox/mlx5/core/devlink.c |  15 +-
 .../net/ethernet/mellanox/mlx5/core/main.c    |   3 -
 .../mellanox/mlx5/core/sf/dev/driver.c        |   5 +-
 drivers/net/ethernet/mellanox/mlxsw/core.c    |  12 +-
 drivers/net/ethernet/mscc/ocelot.h            |   2 +-
 drivers/net/ethernet/mscc/ocelot_net.c        |   2 +-
 .../net/ethernet/netronome/nfp/nfp_devlink.c  |   2 +-
 drivers/net/ethernet/netronome/nfp/nfp_main.h |   2 +-
 .../ethernet/pensando/ionic/ionic_devlink.c   |   2 +-
 drivers/net/ethernet/qlogic/qed/qed_devlink.c |   2 +-
 drivers/net/ethernet/ti/am65-cpsw-nuss.c      |   2 +-
 drivers/net/ethernet/ti/cpsw_new.c            |   2 +-
 drivers/net/netdevsim/dev.c                   |   5 +-
 drivers/ptp/ptp_ocp.c                         |   2 +-
 drivers/staging/qlge/qlge_main.c              |   2 +-
 include/net/devlink.h                         |  15 +-
 net/core/devlink.c                            | 156 ++++++++++--------
 net/dsa/dsa2.c                                |   2 +-
 28 files changed, 131 insertions(+), 134 deletions(-)

Comments

Jakub Kicinski Sept. 29, 2021, 1:40 p.m. UTC | #1
On Wed, 29 Sep 2021 15:00:41 +0300 Leon Romanovsky wrote:
> This series starts from the fixing the bug introduced by implementing
> devlink delayed notifications logic, where I missed some of the
> notifications functions.
> 
> The rest series provides a way to dynamically set devlink ops that is
> needed for mlx5 multiport device and starts cleanup by removing
> not-needed logic.
> 
> In the next series, we will delete various publish API, drop general
> lock, annotate the code and rework logic around devlink->lock.
> 
> All this is possible because driver initialization is separated from the
> user input now.

Swapping ops is a nasty hack in my book.

And all that to avoid having two op structures in one driver.
Or to avoid having counters which are always 0?

Sorry, at the very least you need better explanation for this.

Vladimir Oltean Sept. 29, 2021, 1:46 p.m. UTC | #2
On Wed, Sep 29, 2021 at 06:40:04AM -0700, Jakub Kicinski wrote:
> On Wed, 29 Sep 2021 15:00:41 +0300 Leon Romanovsky wrote:
> > This series starts from the fixing the bug introduced by implementing
> > devlink delayed notifications logic, where I missed some of the
> > notifications functions.
> >
> > The rest series provides a way to dynamically set devlink ops that is
> > needed for mlx5 multiport device and starts cleanup by removing
> > not-needed logic.
> >
> > In the next series, we will delete various publish API, drop general
> > lock, annotate the code and rework logic around devlink->lock.
> >
> > All this is possible because driver initialization is separated from the
> > user input now.
>
> Swapping ops is a nasty hack in my book.
>
> And all that to avoid having two op structures in one driver.
> Or to avoid having counters which are always 0?
>
> Sorry, at the very least you need better explanation for this.

Leon, while the discussion about this unfolds, can you please repost
patch 1 separately? :)
Thanks.

Jakub Kicinski Sept. 29, 2021, 1:56 p.m. UTC | #3
On Wed, 29 Sep 2021 13:46:38 +0000 Vladimir Oltean wrote:
> On Wed, Sep 29, 2021 at 06:40:04AM -0700, Jakub Kicinski wrote:
> > Swapping ops is a nasty hack in my book.
> >
> > And all that to avoid having two op structures in one driver.
> > Or to avoid having counters which are always 0?
> >
> > Sorry, at the very least you need better explanation for this.  
> 
> Leon, while the discussion about this unfolds, can you please repost
> patch 1 separately? :)

Yes, please and thanks! :)

Leon Romanovsky Sept. 29, 2021, 2:13 p.m. UTC | #4
On Wed, Sep 29, 2021 at 06:40:04AM -0700, Jakub Kicinski wrote:
> On Wed, 29 Sep 2021 15:00:41 +0300 Leon Romanovsky wrote:
> > This series starts from the fixing the bug introduced by implementing
> > devlink delayed notifications logic, where I missed some of the
> > notifications functions.
> > 
> > The rest series provides a way to dynamically set devlink ops that is
> > needed for mlx5 multiport device and starts cleanup by removing
> > not-needed logic.
> > 
> > In the next series, we will delete various publish API, drop general
> > lock, annotate the code and rework logic around devlink->lock.
> > 
> > All this is possible because driver initialization is separated from the
> > user input now.
> 
> Swapping ops is a nasty hack in my book.
> 
> And all that to avoid having two op structures in one driver.
> Or to avoid having counters which are always 0?

We don't need to advertise counters for a feature that is not supported.
In multiport mlx5 devices, the reload functionality is not supported, so
this change at least makes that device behave like all other netdev
devices that don't support devlink reload.

The ops structure is set very early to make sure that internal devlink
routines are able to call back into the driver during initialization (btw,
a very questionable design choice), and at that stage the driver doesn't
know yet which device type it is going to drive.

So the answer is:
1. Can't have two structures.
2. Same behaviour across all netdev devices.
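
The ordering constraint described above can be illustrated with a minimal userspace sketch. All names here (`demo_ops`, `demo_dev`, `demo_alloc`, `demo_detect`) are hypothetical and are not the devlink or mlx5 API. The sketch only assumes that an ops table has to be supplied at allocation time, before the device type is known, and that reload callbacks are filled in later for devices that support reload.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ops table; not the real struct devlink_ops. */
struct demo_ops {
	int (*info_get)(void);
	int (*reload_down)(void);
	int (*reload_up)(void);
};

enum demo_type { DEMO_TYPE_UNKNOWN, DEMO_TYPE_SINGLE, DEMO_TYPE_MULTIPORT };

struct demo_dev {
	struct demo_ops ops;	/* per-instance copy, so it can be extended later */
	enum demo_type type;
};

static int demo_info_get(void) { return 0; }
static int demo_reload_down(void) { return 0; }
static int demo_reload_up(void) { return 0; }

/* Step 1: allocation needs an ops table, but the device type is still unknown. */
static void demo_alloc(struct demo_dev *dev)
{
	assert(dev);
	dev->ops = (struct demo_ops){ .info_get = demo_info_get };
	dev->type = DEMO_TYPE_UNKNOWN;
}

/* Step 2: later in probe the type is known; only now can reload be decided. */
static void demo_detect(struct demo_dev *dev, bool multiport)
{
	assert(dev);
	dev->type = multiport ? DEMO_TYPE_MULTIPORT : DEMO_TYPE_SINGLE;
	if (!multiport) {
		dev->ops.reload_down = demo_reload_down;
		dev->ops.reload_up = demo_reload_up;
	}
}
```

In this model a multiport instance ends up with empty reload slots, while a single-port instance gains them after detection. Whether the real driver could instead pick one of two constant tables at allocation time is exactly the point under discussion in this thread.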

> 
> Sorry, at the very least you need better explanation for this.

Was it better explained now?

Leon Romanovsky Sept. 29, 2021, 2:20 p.m. UTC | #5
On Wed, Sep 29, 2021 at 06:56:21AM -0700, Jakub Kicinski wrote:
> On Wed, 29 Sep 2021 13:46:38 +0000 Vladimir Oltean wrote:
> > On Wed, Sep 29, 2021 at 06:40:04AM -0700, Jakub Kicinski wrote:
> > > Swapping ops is a nasty hack in my book.
> > >
> > > And all that to avoid having two op structures in one driver.
> > > Or to avoid having counters which are always 0?
> > >
> > > Sorry, at the very least you need better explanation for this.  
> > 
> > Leon, while the discussion about this unfolds, can you please repost
> > patch 1 separately? :)
> 
> Yes, please and thanks! :)

Done, thanks
https://lore.kernel.org/netdev/2ed1159291f2a589b013914f2b60d8172fc525c1.1632925030.git.leonro@nvidia.com/T/#u

Jakub Kicinski Sept. 29, 2021, 2:39 p.m. UTC | #6
On Wed, 29 Sep 2021 17:13:28 +0300 Leon Romanovsky wrote:
> On Wed, Sep 29, 2021 at 06:40:04AM -0700, Jakub Kicinski wrote:
> > On Wed, 29 Sep 2021 15:00:41 +0300 Leon Romanovsky wrote:  
> > > This series starts from the fixing the bug introduced by implementing
> > > devlink delayed notifications logic, where I missed some of the
> > > notifications functions.
> > > 
> > > The rest series provides a way to dynamically set devlink ops that is
> > > needed for mlx5 multiport device and starts cleanup by removing
> > > not-needed logic.
> > > 
> > > In the next series, we will delete various publish API, drop general
> > > lock, annotate the code and rework logic around devlink->lock.
> > > 
> > > All this is possible because driver initialization is separated from the
> > > user input now.  
> > 
> > Swapping ops is a nasty hack in my book.
> > 
> > And all that to avoid having two op structures in one driver.
> > Or to avoid having counters which are always 0?  
> 
> We don't need to advertise counters for feature that is not supported.
> In multiport mlx5 devices, the reload functionality is not supported, so
> this change at least make that device to behave like all other netdev
> devices that don't support devlink reload.
> 
> The ops structure is set very early to make sure that internal devlink
> routines will be able access driver back during initialization (btw very
> questionable design choice)

Indeed, is this fixable? Or now that devlink_register() was moved to 
the end of probe netdev can call ops before instance is registered?

> and at that stage the driver doesn't know
> yet which device type it is going to drive.
> 
> So the answer is:
> 1. Can't have two structures.

I still don't understand why. To be clear - swapping full op structures
is probably acceptable if it's a pure upgrade (existing pointers match).
Poking new ops into a structure (in alphabetical order if I understand
your reply to Greg, not destructor-before-constructor) is what I deem
questionable.
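
One possible reading of "pure upgrade (existing pointers match)" is sketched below in plain userspace C. The names (`demo_ops`, `demo_is_pure_upgrade`, `demo_swap_ops`) are hypothetical and do not exist in devlink. The sketch assumes that a replacement table is acceptable only when every callback already set in the current table is unchanged in the new one, and it ignores locking and concurrent readers entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ops table; not the real struct devlink_ops. */
struct demo_ops {
	int (*info_get)(void);
	int (*reload_down)(void);
	int (*reload_up)(void);
};

/* A callback slot is compatible if it was empty before or is unchanged. */
#define DEMO_SLOT_OK(cur, next, name) \
	(!(cur)->name || (cur)->name == (next)->name)

static bool demo_is_pure_upgrade(const struct demo_ops *cur,
				 const struct demo_ops *next)
{
	assert(cur && next);
	return DEMO_SLOT_OK(cur, next, info_get) &&
	       DEMO_SLOT_OK(cur, next, reload_down) &&
	       DEMO_SLOT_OK(cur, next, reload_up);
}

/* Replace the whole table pointer, but only for a pure upgrade. */
static bool demo_swap_ops(const struct demo_ops **slot,
			  const struct demo_ops *next)
{
	if (!demo_is_pure_upgrade(*slot, next))
		return false;
	*slot = next;
	return true;
}

static int demo_info_get(void) { return 0; }
static int demo_other_info_get(void) { return 1; }
static int demo_reload_down(void) { return 0; }
static int demo_reload_up(void) { return 0; }
```

With this check, moving from a table that only has `info_get` to one that adds the reload callbacks succeeds, while a table that changes `info_get` is rejected.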

> 2. Same behaviour across all netdev devices.

Unclear what this is referring to.

Leon Romanovsky Sept. 29, 2021, 3:31 p.m. UTC | #7
On Wed, Sep 29, 2021 at 07:39:40AM -0700, Jakub Kicinski wrote:
> On Wed, 29 Sep 2021 17:13:28 +0300 Leon Romanovsky wrote:
> > On Wed, Sep 29, 2021 at 06:40:04AM -0700, Jakub Kicinski wrote:
> > > On Wed, 29 Sep 2021 15:00:41 +0300 Leon Romanovsky wrote:  
> > > > This series starts from the fixing the bug introduced by implementing
> > > > devlink delayed notifications logic, where I missed some of the
> > > > notifications functions.
> > > > 
> > > > The rest series provides a way to dynamically set devlink ops that is
> > > > needed for mlx5 multiport device and starts cleanup by removing
> > > > not-needed logic.
> > > > 
> > > > In the next series, we will delete various publish API, drop general
> > > > lock, annotate the code and rework logic around devlink->lock.
> > > > 
> > > > All this is possible because driver initialization is separated from the
> > > > user input now.  
> > > 
> > > Swapping ops is a nasty hack in my book.
> > > 
> > > And all that to avoid having two op structures in one driver.
> > > Or to avoid having counters which are always 0?  
> > 
> > We don't need to advertise counters for feature that is not supported.
> > In multiport mlx5 devices, the reload functionality is not supported, so
> > this change at least make that device to behave like all other netdev
> > devices that don't support devlink reload.
> > 
> > The ops structure is set very early to make sure that internal devlink
> > routines will be able access driver back during initialization (btw very
> > questionable design choice)
> 
> Indeed, is this fixable? Or now that devlink_register() was moved to 
> the end of probe netdev can call ops before instance is registered?
> 
> > and at that stage the driver doesn't know
> > yet which device type it is going to drive.
> > 
> > So the answer is:
> > 1. Can't have two structures.
> 
> I still don't understand why. To be clear - swapping full op structures
> is probably acceptable if it's a pure upgrade (existing pointers match).
> Poking new ops into a structure (in alphabetical order if I understand
> your reply to Greg, not destructor-before-contructor) is what I deem
> questionable.

It is sorted simply for readability and not for any other technical
reason.

Regarding new ops, this is how we are setting callbacks in RDMA based on
actual device support. It works like a charm.
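
As a rough illustration of setting callbacks based on actual device support, the following userspace sketch copies a callback into a live table only when the source provides it and the destination slot is still empty. It is written in the spirit of the RDMA approach mentioned above but does not reproduce any kernel code; `demo_ops`, `DEMO_SET_OP` and `demo_set_ops` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table; not the real struct devlink_ops. */
struct demo_ops {
	int (*info_get)(void);
	int (*reload_down)(void);
	int (*reload_up)(void);
};

/* Copy one callback only if src provides it and dst has not set it yet. */
#define DEMO_SET_OP(dst, src, name)				\
	do {							\
		if ((src)->name && !(dst)->name)		\
			(dst)->name = (src)->name;		\
	} while (0)

/* Entries are listed alphabetically for readability only. */
static void demo_set_ops(struct demo_ops *dst, const struct demo_ops *src)
{
	assert(dst && src);
	DEMO_SET_OP(dst, src, info_get);
	DEMO_SET_OP(dst, src, reload_down);
	DEMO_SET_OP(dst, src, reload_up);
}

static int demo_info_get(void) { return 1; }
static int demo_reload_down(void) { return 2; }
static int demo_reload_up(void) { return 3; }
```

Because an already populated slot is never overwritten, this is additive rather than a swap. It still writes to a table that other code may be reading, so any real use would need to happen before the instance is visible to users.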

> 
> > 2. Same behaviour across all netdev devices.
> 
> Unclear what this is referring to.

If your device doesn't support devlink reload, it won't print any
reload counters at all. That is not the case for the multiport mlx5
device: it doesn't support reload, but still presents these counters.
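
The behaviour described here, where counters appear only when reload is supported, can be modelled with a small userspace sketch. The names (`demo_ops`, `demo_dev`, `demo_reload_supported`, `demo_show_reload_stats`) and the output format are hypothetical; the sketch assumes that "supported" means both reload callbacks are present.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical ops table; not the real struct devlink_ops. */
struct demo_ops {
	int (*info_get)(void);
	int (*reload_down)(void);
	int (*reload_up)(void);
};

struct demo_dev {
	const struct demo_ops *ops;
	unsigned int reload_count;
};

/* Reload counts as supported only when both halves are implemented. */
static bool demo_reload_supported(const struct demo_ops *ops)
{
	return ops->reload_down && ops->reload_up;
}

/*
 * Format reload statistics into buf. Returns 0 and leaves buf untouched
 * when reload is unsupported, so no counters are shown at all.
 */
static int demo_show_reload_stats(const struct demo_dev *dev,
				  char *buf, size_t len)
{
	if (!demo_reload_supported(dev->ops))
		return 0;
	return snprintf(buf, len, "reload_count %u\n", dev->reload_count);
}

static int demo_info_get(void) { return 0; }
static int demo_reload_down(void) { return 0; }
static int demo_reload_up(void) { return 0; }
```

In this model a device whose table carries reload callbacks it cannot actually honour would still show counters, which is the mismatch described for the multiport case.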

Thanks

Jakub Kicinski Sept. 29, 2021, 5:55 p.m. UTC | #8
On Wed, 29 Sep 2021 18:31:51 +0300 Leon Romanovsky wrote:
> On Wed, Sep 29, 2021 at 07:39:40AM -0700, Jakub Kicinski wrote:
> > On Wed, 29 Sep 2021 17:13:28 +0300 Leon Romanovsky wrote:  
> > > We don't need to advertise counters for feature that is not supported.
> > > In multiport mlx5 devices, the reload functionality is not supported, so
> > > this change at least make that device to behave like all other netdev
> > > devices that don't support devlink reload.
> > > 
> > > The ops structure is set very early to make sure that internal devlink
> > > routines will be able access driver back during initialization (btw very
> > > questionable design choice)  
> > 
> > Indeed, is this fixable? Or now that devlink_register() was moved to 
> > the end of probe netdev can call ops before instance is registered?
> >   
> > > and at that stage the driver doesn't know
> > > yet which device type it is going to drive.
> > > 
> > > So the answer is:
> > > 1. Can't have two structures.  
> > 
> > I still don't understand why. To be clear - swapping full op structures
> > is probably acceptable if it's a pure upgrade (existing pointers match).
> > Poking new ops into a structure (in alphabetical order if I understand
> > your reply to Greg, not destructor-before-contructor) is what I deem
> > questionable.  
> 
> It is sorted simply for readability and not for any other technical
> reason.
> 
> Regarding new ops, this is how we are setting callbacks in RDMA based on
> actual device support. It works like a charm.
> 
> > > 2. Same behaviour across all netdev devices.  
> > 
> > Unclear what this is referring to.  
> 
> If your device doesn't support devlink reload, it won't print any
> reload counters at all. It is not the case for the multiport mlx5
> device. It doesn't support, but still present these counters.

There's myriad ways you can hide features.

Swapping ops is heavy handed and prone to data races, I don't like it.

Leon Romanovsky Sept. 29, 2021, 7:11 p.m. UTC | #9
On Wed, Sep 29, 2021 at 10:55:37AM -0700, Jakub Kicinski wrote:
> On Wed, 29 Sep 2021 18:31:51 +0300 Leon Romanovsky wrote:
> > On Wed, Sep 29, 2021 at 07:39:40AM -0700, Jakub Kicinski wrote:
> > > On Wed, 29 Sep 2021 17:13:28 +0300 Leon Romanovsky wrote:  
> > > > We don't need to advertise counters for feature that is not supported.
> > > > In multiport mlx5 devices, the reload functionality is not supported, so
> > > > this change at least make that device to behave like all other netdev
> > > > devices that don't support devlink reload.
> > > > 
> > > > The ops structure is set very early to make sure that internal devlink
> > > > routines will be able access driver back during initialization (btw very
> > > > questionable design choice)  
> > > 
> > > Indeed, is this fixable? Or now that devlink_register() was moved to 
> > > the end of probe netdev can call ops before instance is registered?
> > >   
> > > > and at that stage the driver doesn't know
> > > > yet which device type it is going to drive.
> > > > 
> > > > So the answer is:
> > > > 1. Can't have two structures.  
> > > 
> > > I still don't understand why. To be clear - swapping full op structures
> > > is probably acceptable if it's a pure upgrade (existing pointers match).
> > > Poking new ops into a structure (in alphabetical order if I understand
> > > your reply to Greg, not destructor-before-contructor) is what I deem
> > > questionable.  
> > 
> > It is sorted simply for readability and not for any other technical
> > reason.
> > 
> > Regarding new ops, this is how we are setting callbacks in RDMA based on
> > actual device support. It works like a charm.
> > 
> > > > 2. Same behaviour across all netdev devices.  
> > > 
> > > Unclear what this is referring to.  
> > 
> > If your device doesn't support devlink reload, it won't print any
> > reload counters at all. It is not the case for the multiport mlx5
> > device. It doesn't support, but still present these counters.
> 
> There's myriad ways you can hide features.
> 
> Swapping ops is heavy handed and prone to data races, I don't like it.

I'm not swapping, but setting only in supported devices.

Anyway, please give me a chance to present an improved version of this
mechanism and we will continue from there.

Thanks