mbox series

[rdma-next,v2,0/5] Get rid of custom made module dependency

Message ID 20210401065715.565226-1-leon@kernel.org (mailing list archive)
Headers show
Series Get rid of custom made module dependency | expand

Message

Leon Romanovsky April 1, 2021, 6:57 a.m. UTC
From: Leon Romanovsky <leonro@nvidia.com>

Changelog:
v2:
 * kbuild spotted that I didn't delete all code in patch #5, so deleted
   even more ulp_ops derefences.
v1: https://lore.kernel.org/linux-rdma/20210329085212.257771-1-leon@kernel.org
 * Go much deeper and removed useless ULP indirection
v0: https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
-----------------------------------------------------------------------

The following series fixes issue spotted in [1], where bnxt_re driver
messed with module reference counting in order to implement symbol
dependency of bnxt_re and bnxt modules. All of this is done, when in
upstream we have only one ULP user of that bnxt module. The simple
declaration of exported symbol would do the trick.

This series removes that custom module_get/_put, which is not supposed
to be in the driver from the beginning and get rid of nasty indirection
logic that isn't relevant for the upstream code.

Such small changes allow us to simplify the bnxt code and my hope that
Devesh will continue where I stopped and remove struct bnxt_ulp_ops too.

Thanks

[1] https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org

Leon Romanovsky (5):
  RDMA/bnxt_re: Depend on bnxt ethernet driver and not blindly select it
  RDMA/bnxt_re: Create direct symbolic link between bnxt modules
  RDMA/bnxt_re: Get rid of custom module reference counting
  net/bnxt: Remove useless check of non-existent ULP id
  net/bnxt: Use direct API instead of useless indirection

 drivers/infiniband/hw/bnxt_re/Kconfig         |   4 +-
 drivers/infiniband/hw/bnxt_re/main.c          |  93 ++-----
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     |   4 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |   1 -
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 245 +++++++-----------
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h |  32 +--
 6 files changed, 119 insertions(+), 260 deletions(-)

Comments

Devesh Sharma April 3, 2021, 10:22 a.m. UTC | #1
On Thu, Apr 1, 2021 at 12:27 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> From: Leon Romanovsky <leonro@nvidia.com>
>
> Changelog:
> v2:
>  * kbuild spotted that I didn't delete all code in patch #5, so deleted
>    even more ulp_ops derefences.
> v1: https://lore.kernel.org/linux-rdma/20210329085212.257771-1-leon@kernel.org
>  * Go much deeper and removed useless ULP indirection
> v0: https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
> -----------------------------------------------------------------------
>
> The following series fixes issue spotted in [1], where bnxt_re driver
> messed with module reference counting in order to implement symbol
> dependency of bnxt_re and bnxt modules. All of this is done, when in
> upstream we have only one ULP user of that bnxt module. The simple
> declaration of exported symbol would do the trick.
>
> This series removes that custom module_get/_put, which is not supposed
> to be in the driver from the beginning and get rid of nasty indirection
> logic that isn't relevant for the upstream code.
>
> Such small changes allow us to simplify the bnxt code and my hope that
> Devesh will continue where I stopped and remove struct bnxt_ulp_ops too.
>
> Thanks
>
> [1] https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
>
> Leon Romanovsky (5):
>   RDMA/bnxt_re: Depend on bnxt ethernet driver and not blindly select it
>   RDMA/bnxt_re: Create direct symbolic link between bnxt modules
>   RDMA/bnxt_re: Get rid of custom module reference counting
>   net/bnxt: Remove useless check of non-existent ULP id
>   net/bnxt: Use direct API instead of useless indirection
>
>  drivers/infiniband/hw/bnxt_re/Kconfig         |   4 +-
>  drivers/infiniband/hw/bnxt_re/main.c          |  93 ++-----
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c     |   4 +-
>  drivers/net/ethernet/broadcom/bnxt/bnxt.h     |   1 -
>  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 245 +++++++-----------
>  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h |  32 +--
>  6 files changed, 119 insertions(+), 260 deletions(-)

Hi Leon,

After a couple of internal discussions we reached a conclusion to
implement the Auxbus driver interface and fix the problem once and for
all.
>
> --
> 2.30.2
>
Leon Romanovsky April 3, 2021, 11:42 a.m. UTC | #2
On Sat, Apr 03, 2021 at 03:52:13PM +0530, Devesh Sharma wrote:
> On Thu, Apr 1, 2021 at 12:27 PM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > From: Leon Romanovsky <leonro@nvidia.com>
> >
> > Changelog:
> > v2:
> >  * kbuild spotted that I didn't delete all code in patch #5, so deleted
> >    even more ulp_ops derefences.
> > v1: https://lore.kernel.org/linux-rdma/20210329085212.257771-1-leon@kernel.org
> >  * Go much deeper and removed useless ULP indirection
> > v0: https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
> > -----------------------------------------------------------------------
> >
> > The following series fixes issue spotted in [1], where bnxt_re driver
> > messed with module reference counting in order to implement symbol
> > dependency of bnxt_re and bnxt modules. All of this is done, when in
> > upstream we have only one ULP user of that bnxt module. The simple
> > declaration of exported symbol would do the trick.
> >
> > This series removes that custom module_get/_put, which is not supposed
> > to be in the driver from the beginning and get rid of nasty indirection
> > logic that isn't relevant for the upstream code.
> >
> > Such small changes allow us to simplify the bnxt code and my hope that
> > Devesh will continue where I stopped and remove struct bnxt_ulp_ops too.
> >
> > Thanks
> >
> > [1] https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
> >
> > Leon Romanovsky (5):
> >   RDMA/bnxt_re: Depend on bnxt ethernet driver and not blindly select it
> >   RDMA/bnxt_re: Create direct symbolic link between bnxt modules
> >   RDMA/bnxt_re: Get rid of custom module reference counting
> >   net/bnxt: Remove useless check of non-existent ULP id
> >   net/bnxt: Use direct API instead of useless indirection
> >
> >  drivers/infiniband/hw/bnxt_re/Kconfig         |   4 +-
> >  drivers/infiniband/hw/bnxt_re/main.c          |  93 ++-----
> >  drivers/net/ethernet/broadcom/bnxt/bnxt.c     |   4 +-
> >  drivers/net/ethernet/broadcom/bnxt/bnxt.h     |   1 -
> >  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 245 +++++++-----------
> >  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h |  32 +--
> >  6 files changed, 119 insertions(+), 260 deletions(-)
> 
> Hi Leon,
> 
> After a couple of internal discussions we reached a conclusion to
> implement the Auxbus driver interface and fix the problem once and for
> all.

Thanks Devesh,

Jason, it looks like we can proceed with this patchset, because in
auxbus mode this module refcount and ULP indirection logics will be
removed anyway.

Thanks

> >
> > --
> > 2.30.2
> >
> 
> 
> -- 
> -Regards
> Devesh
Devesh Sharma April 8, 2021, 11:36 a.m. UTC | #3
On Sat, Apr 3, 2021 at 5:12 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Sat, Apr 03, 2021 at 03:52:13PM +0530, Devesh Sharma wrote:
> > On Thu, Apr 1, 2021 at 12:27 PM Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > From: Leon Romanovsky <leonro@nvidia.com>
> > >
> > > Changelog:
> > > v2:
> > >  * kbuild spotted that I didn't delete all code in patch #5, so deleted
> > >    even more ulp_ops derefences.
> > > v1: https://lore.kernel.org/linux-rdma/20210329085212.257771-1-leon@kernel.org
> > >  * Go much deeper and removed useless ULP indirection
> > > v0: https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
> > > -----------------------------------------------------------------------
> > >
> > > The following series fixes issue spotted in [1], where bnxt_re driver
> > > messed with module reference counting in order to implement symbol
> > > dependency of bnxt_re and bnxt modules. All of this is done, when in
> > > upstream we have only one ULP user of that bnxt module. The simple
> > > declaration of exported symbol would do the trick.
> > >
> > > This series removes that custom module_get/_put, which is not supposed
> > > to be in the driver from the beginning and get rid of nasty indirection
> > > logic that isn't relevant for the upstream code.
> > >
> > > Such small changes allow us to simplify the bnxt code and my hope that
> > > Devesh will continue where I stopped and remove struct bnxt_ulp_ops too.
> > >
> > > Thanks
> > >
> > > [1] https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
> > >
> > > Leon Romanovsky (5):
> > >   RDMA/bnxt_re: Depend on bnxt ethernet driver and not blindly select it
> > >   RDMA/bnxt_re: Create direct symbolic link between bnxt modules
> > >   RDMA/bnxt_re: Get rid of custom module reference counting
> > >   net/bnxt: Remove useless check of non-existent ULP id
> > >   net/bnxt: Use direct API instead of useless indirection
> > >
> > >  drivers/infiniband/hw/bnxt_re/Kconfig         |   4 +-
> > >  drivers/infiniband/hw/bnxt_re/main.c          |  93 ++-----
> > >  drivers/net/ethernet/broadcom/bnxt/bnxt.c     |   4 +-
> > >  drivers/net/ethernet/broadcom/bnxt/bnxt.h     |   1 -
> > >  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 245 +++++++-----------
> > >  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h |  32 +--
> > >  6 files changed, 119 insertions(+), 260 deletions(-)
> >
> > Hi Leon,
> >
> > After a couple of internal discussions we reached a conclusion to
> > implement the Auxbus driver interface and fix the problem once and for
> > all.
>
> Thanks Devesh,
>
> Jason, it looks like we can proceed with this patchset, because in
> auxbus mode this module refcount and ULP indirection logics will be
> removed anyway.
>
> Thanks
Hi Leon,

In my internal testing, I am seeing a crash using the 3rd patch. I am
spending a few cycles on debugging it. expect my input in a day or so.
>
> > >
> > > --
> > > 2.30.2
> > >
> >
> >
> > --
> > -Regards
> > Devesh
>
>
Leon Romanovsky April 8, 2021, 11:44 a.m. UTC | #4
On Thu, Apr 08, 2021 at 05:06:24PM +0530, Devesh Sharma wrote:
> On Sat, Apr 3, 2021 at 5:12 PM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Sat, Apr 03, 2021 at 03:52:13PM +0530, Devesh Sharma wrote:
> > > On Thu, Apr 1, 2021 at 12:27 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > >
> > > > From: Leon Romanovsky <leonro@nvidia.com>
> > > >
> > > > Changelog:
> > > > v2:
> > > >  * kbuild spotted that I didn't delete all code in patch #5, so deleted
> > > >    even more ulp_ops derefences.
> > > > v1: https://lore.kernel.org/linux-rdma/20210329085212.257771-1-leon@kernel.org
> > > >  * Go much deeper and removed useless ULP indirection
> > > > v0: https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
> > > > -----------------------------------------------------------------------
> > > >
> > > > The following series fixes issue spotted in [1], where bnxt_re driver
> > > > messed with module reference counting in order to implement symbol
> > > > dependency of bnxt_re and bnxt modules. All of this is done, when in
> > > > upstream we have only one ULP user of that bnxt module. The simple
> > > > declaration of exported symbol would do the trick.
> > > >
> > > > This series removes that custom module_get/_put, which is not supposed
> > > > to be in the driver from the beginning and get rid of nasty indirection
> > > > logic that isn't relevant for the upstream code.
> > > >
> > > > Such small changes allow us to simplify the bnxt code and my hope that
> > > > Devesh will continue where I stopped and remove struct bnxt_ulp_ops too.
> > > >
> > > > Thanks
> > > >
> > > > [1] https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
> > > >
> > > > Leon Romanovsky (5):
> > > >   RDMA/bnxt_re: Depend on bnxt ethernet driver and not blindly select it
> > > >   RDMA/bnxt_re: Create direct symbolic link between bnxt modules
> > > >   RDMA/bnxt_re: Get rid of custom module reference counting
> > > >   net/bnxt: Remove useless check of non-existent ULP id
> > > >   net/bnxt: Use direct API instead of useless indirection
> > > >
> > > >  drivers/infiniband/hw/bnxt_re/Kconfig         |   4 +-
> > > >  drivers/infiniband/hw/bnxt_re/main.c          |  93 ++-----
> > > >  drivers/net/ethernet/broadcom/bnxt/bnxt.c     |   4 +-
> > > >  drivers/net/ethernet/broadcom/bnxt/bnxt.h     |   1 -
> > > >  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 245 +++++++-----------
> > > >  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h |  32 +--
> > > >  6 files changed, 119 insertions(+), 260 deletions(-)
> > >
> > > Hi Leon,
> > >
> > > After a couple of internal discussions we reached a conclusion to
> > > implement the Auxbus driver interface and fix the problem once and for
> > > all.
> >
> > Thanks Devesh,
> >
> > Jason, it looks like we can proceed with this patchset, because in
> > auxbus mode this module refcount and ULP indirection logics will be
> > removed anyway.
> >
> > Thanks
> Hi Leon,
> 
> In my internal testing, I am seeing a crash using the 3rd patch. I am
> spending a few cycles on debugging it. expect my input in a day or so.

Can you please post the kernel crash report here?
I don't see how function rename in patch #3 can cause to the crash.

Thanks

> >
> > > >
> > > > --
> > > > 2.30.2
> > > >
> > >
> > >
> > > --
> > > -Regards
> > > Devesh
> >
> >
> 
> 
> -- 
> -Regards
> Devesh
Jason Gunthorpe April 8, 2021, 11:53 a.m. UTC | #5
On Thu, Apr 08, 2021 at 02:44:45PM +0300, Leon Romanovsky wrote:

> > In my internal testing, I am seeing a crash using the 3rd patch. I am
> > spending a few cycles on debugging it. expect my input in a day or so.
> 
> Can you please post the kernel crash report here?
> I don't see how function rename in patch #3 can cause to the crash.

I looked too, I'm also quite surprised that 1,2,3 alone have a
bug.. Is there some condition where ulp_probe can be null?

Ugh the is_bnxt_re_dev() is horribly gross too

Jason
Leon Romanovsky April 8, 2021, 12:03 p.m. UTC | #6
On Thu, Apr 08, 2021 at 08:53:47AM -0300, Jason Gunthorpe wrote:
> On Thu, Apr 08, 2021 at 02:44:45PM +0300, Leon Romanovsky wrote:
> 
> > > In my internal testing, I am seeing a crash using the 3rd patch. I am
> > > spending a few cycles on debugging it. expect my input in a day or so.
> > 
> > Can you please post the kernel crash report here?
> > I don't see how function rename in patch #3 can cause to the crash.
> 
> I looked too, I'm also quite surprised that 1,2,3 alone have a
> bug.. Is there some condition where ulp_probe can be null?

My speculative guess that they are testing not upstream kernel/module.

> 
> Ugh the is_bnxt_re_dev() is horribly gross too

The whole bnxt* code is very creative. The function bnxt_re_from_netdev()
below is junk too.

It is interesting to see how my review skills from 2017 improved over years :).

Thanks

> 
> Jason
Devesh Sharma April 8, 2021, 3:12 p.m. UTC | #7
On Thu, Apr 8, 2021 at 5:14 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Thu, Apr 08, 2021 at 05:06:24PM +0530, Devesh Sharma wrote:
> > On Sat, Apr 3, 2021 at 5:12 PM Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > On Sat, Apr 03, 2021 at 03:52:13PM +0530, Devesh Sharma wrote:
> > > > On Thu, Apr 1, 2021 at 12:27 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > > >
> > > > > From: Leon Romanovsky <leonro@nvidia.com>
> > > > >
> > > > > Changelog:
> > > > > v2:
> > > > >  * kbuild spotted that I didn't delete all code in patch #5, so deleted
> > > > >    even more ulp_ops derefences.
> > > > > v1: https://lore.kernel.org/linux-rdma/20210329085212.257771-1-leon@kernel.org
> > > > >  * Go much deeper and removed useless ULP indirection
> > > > > v0: https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
> > > > > -----------------------------------------------------------------------
> > > > >
> > > > > The following series fixes issue spotted in [1], where bnxt_re driver
> > > > > messed with module reference counting in order to implement symbol
> > > > > dependency of bnxt_re and bnxt modules. All of this is done, when in
> > > > > upstream we have only one ULP user of that bnxt module. The simple
> > > > > declaration of exported symbol would do the trick.
> > > > >
> > > > > This series removes that custom module_get/_put, which is not supposed
> > > > > to be in the driver from the beginning and get rid of nasty indirection
> > > > > logic that isn't relevant for the upstream code.
> > > > >
> > > > > Such small changes allow us to simplify the bnxt code and my hope that
> > > > > Devesh will continue where I stopped and remove struct bnxt_ulp_ops too.
> > > > >
> > > > > Thanks
> > > > >
> > > > > [1] https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
> > > > >
> > > > > Leon Romanovsky (5):
> > > > >   RDMA/bnxt_re: Depend on bnxt ethernet driver and not blindly select it
> > > > >   RDMA/bnxt_re: Create direct symbolic link between bnxt modules
> > > > >   RDMA/bnxt_re: Get rid of custom module reference counting
> > > > >   net/bnxt: Remove useless check of non-existent ULP id
> > > > >   net/bnxt: Use direct API instead of useless indirection
> > > > >
> > > > >  drivers/infiniband/hw/bnxt_re/Kconfig         |   4 +-
> > > > >  drivers/infiniband/hw/bnxt_re/main.c          |  93 ++-----
> > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt.c     |   4 +-
> > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt.h     |   1 -
> > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 245 +++++++-----------
> > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h |  32 +--
> > > > >  6 files changed, 119 insertions(+), 260 deletions(-)
> > > >
> > > > Hi Leon,
> > > >
> > > > After a couple of internal discussions we reached a conclusion to
> > > > implement the Auxbus driver interface and fix the problem once and for
> > > > all.
> > >
> > > Thanks Devesh,
> > >
> > > Jason, it looks like we can proceed with this patchset, because in
> > > auxbus mode this module refcount and ULP indirection logics will be
> > > removed anyway.
> > >
> > > Thanks
> > Hi Leon,
> >
> > In my internal testing, I am seeing a crash using the 3rd patch. I am
> > spending a few cycles on debugging it. expect my input in a day or so.
>
> Can you please post the kernel crash report here?
> I don't see how function rename in patch #3 can cause to the crash.
Hey, unfortunately my kdump service config is giving me tough time on
my host. I will share if I get it.
>
> Thanks
>
> > >
> > > > >
> > > > > --
> > > > > 2.30.2
> > > > >
> > > >
> > > >
> > > > --
> > > > -Regards
> > > > Devesh
> > >
> > >
> >
> >
> > --
> > -Regards
> > Devesh
>
>
Devesh Sharma April 8, 2021, 3:21 p.m. UTC | #8
On Thu, Apr 8, 2021 at 5:23 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Thu, Apr 08, 2021 at 02:44:45PM +0300, Leon Romanovsky wrote:
>
> > > In my internal testing, I am seeing a crash using the 3rd patch. I am
> > > spending a few cycles on debugging it. expect my input in a day or so.
> >
> > Can you please post the kernel crash report here?
> > I don't see how function rename in patch #3 can cause to the crash.
>
> I looked too, I'm also quite surprised that 1,2,3 alone have a
> bug.. Is there some condition where ulp_probe can be null?
>
> Ugh the is_bnxt_re_dev() is horribly gross too
Yeah, it is indeed. I will take this feedback to the internal team.
>
> Jason
Leon Romanovsky April 12, 2021, 7:40 a.m. UTC | #9
On Thu, Apr 08, 2021 at 08:42:57PM +0530, Devesh Sharma wrote:
> On Thu, Apr 8, 2021 at 5:14 PM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Thu, Apr 08, 2021 at 05:06:24PM +0530, Devesh Sharma wrote:
> > > On Sat, Apr 3, 2021 at 5:12 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > >
> > > > On Sat, Apr 03, 2021 at 03:52:13PM +0530, Devesh Sharma wrote:
> > > > > On Thu, Apr 1, 2021 at 12:27 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > > > >
> > > > > > From: Leon Romanovsky <leonro@nvidia.com>
> > > > > >
> > > > > > Changelog:
> > > > > > v2:
> > > > > >  * kbuild spotted that I didn't delete all code in patch #5, so deleted
> > > > > >    even more ulp_ops derefences.
> > > > > > v1: https://lore.kernel.org/linux-rdma/20210329085212.257771-1-leon@kernel.org
> > > > > >  * Go much deeper and removed useless ULP indirection
> > > > > > v0: https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
> > > > > > -----------------------------------------------------------------------
> > > > > >
> > > > > > The following series fixes issue spotted in [1], where bnxt_re driver
> > > > > > messed with module reference counting in order to implement symbol
> > > > > > dependency of bnxt_re and bnxt modules. All of this is done, when in
> > > > > > upstream we have only one ULP user of that bnxt module. The simple
> > > > > > declaration of exported symbol would do the trick.
> > > > > >
> > > > > > This series removes that custom module_get/_put, which is not supposed
> > > > > > to be in the driver from the beginning and get rid of nasty indirection
> > > > > > logic that isn't relevant for the upstream code.
> > > > > >
> > > > > > Such small changes allow us to simplify the bnxt code and my hope that
> > > > > > Devesh will continue where I stopped and remove struct bnxt_ulp_ops too.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > [1] https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
> > > > > >
> > > > > > Leon Romanovsky (5):
> > > > > >   RDMA/bnxt_re: Depend on bnxt ethernet driver and not blindly select it
> > > > > >   RDMA/bnxt_re: Create direct symbolic link between bnxt modules
> > > > > >   RDMA/bnxt_re: Get rid of custom module reference counting
> > > > > >   net/bnxt: Remove useless check of non-existent ULP id
> > > > > >   net/bnxt: Use direct API instead of useless indirection
> > > > > >
> > > > > >  drivers/infiniband/hw/bnxt_re/Kconfig         |   4 +-
> > > > > >  drivers/infiniband/hw/bnxt_re/main.c          |  93 ++-----
> > > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt.c     |   4 +-
> > > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt.h     |   1 -
> > > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 245 +++++++-----------
> > > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h |  32 +--
> > > > > >  6 files changed, 119 insertions(+), 260 deletions(-)
> > > > >
> > > > > Hi Leon,
> > > > >
> > > > > After a couple of internal discussions we reached a conclusion to
> > > > > implement the Auxbus driver interface and fix the problem once and for
> > > > > all.
> > > >
> > > > Thanks Devesh,
> > > >
> > > > Jason, it looks like we can proceed with this patchset, because in
> > > > auxbus mode this module refcount and ULP indirection logics will be
> > > > removed anyway.
> > > >
> > > > Thanks
> > > Hi Leon,
> > >
> > > In my internal testing, I am seeing a crash using the 3rd patch. I am
> > > spending a few cycles on debugging it. expect my input in a day or so.
> >
> > Can you please post the kernel crash report here?
> > I don't see how function rename in patch #3 can cause to the crash.
> Hey, unfortunately my kdump service config is giving me tough time on
> my host. I will share if I get it.

Any news here?

> >
> > Thanks
> >
> > > >
> > > > > >
> > > > > > --
> > > > > > 2.30.2
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > -Regards
> > > > > Devesh
> > > >
> > > >
> > >
> > >
> > > --
> > > -Regards
> > > Devesh
> >
> >
> 
> 
> -- 
> -Regards
> Devesh
Devesh Sharma April 14, 2021, 1:45 p.m. UTC | #10
On Mon, Apr 12, 2021 at 1:10 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Thu, Apr 08, 2021 at 08:42:57PM +0530, Devesh Sharma wrote:
> > On Thu, Apr 8, 2021 at 5:14 PM Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > On Thu, Apr 08, 2021 at 05:06:24PM +0530, Devesh Sharma wrote:
> > > > On Sat, Apr 3, 2021 at 5:12 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > > >
> > > > > On Sat, Apr 03, 2021 at 03:52:13PM +0530, Devesh Sharma wrote:
> > > > > > On Thu, Apr 1, 2021 at 12:27 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > > > > >
> > > > > > > From: Leon Romanovsky <leonro@nvidia.com>
> > > > > > >
> > > > > > > Changelog:
> > > > > > > v2:
> > > > > > >  * kbuild spotted that I didn't delete all code in patch #5, so deleted
> > > > > > >    even more ulp_ops derefences.
> > > > > > > v1: https://lore.kernel.org/linux-rdma/20210329085212.257771-1-leon@kernel.org
> > > > > > >  * Go much deeper and removed useless ULP indirection
> > > > > > > v0: https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
> > > > > > > -----------------------------------------------------------------------
> > > > > > >
> > > > > > > The following series fixes issue spotted in [1], where bnxt_re driver
> > > > > > > messed with module reference counting in order to implement symbol
> > > > > > > dependency of bnxt_re and bnxt modules. All of this is done, when in
> > > > > > > upstream we have only one ULP user of that bnxt module. The simple
> > > > > > > declaration of exported symbol would do the trick.
> > > > > > >
> > > > > > > This series removes that custom module_get/_put, which is not supposed
> > > > > > > to be in the driver from the beginning and get rid of nasty indirection
> > > > > > > logic that isn't relevant for the upstream code.
> > > > > > >
> > > > > > > Such small changes allow us to simplify the bnxt code and my hope that
> > > > > > > Devesh will continue where I stopped and remove struct bnxt_ulp_ops too.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > [1] https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
> > > > > > >
> > > > > > > Leon Romanovsky (5):
> > > > > > >   RDMA/bnxt_re: Depend on bnxt ethernet driver and not blindly select it
> > > > > > >   RDMA/bnxt_re: Create direct symbolic link between bnxt modules
> > > > > > >   RDMA/bnxt_re: Get rid of custom module reference counting
> > > > > > >   net/bnxt: Remove useless check of non-existent ULP id
> > > > > > >   net/bnxt: Use direct API instead of useless indirection
> > > > > > >
> > > > > > >  drivers/infiniband/hw/bnxt_re/Kconfig         |   4 +-
> > > > > > >  drivers/infiniband/hw/bnxt_re/main.c          |  93 ++-----
> > > > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt.c     |   4 +-
> > > > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt.h     |   1 -
> > > > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 245 +++++++-----------
> > > > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h |  32 +--
> > > > > > >  6 files changed, 119 insertions(+), 260 deletions(-)
> > > > > >
> > > > > > Hi Leon,
> > > > > >
> > > > > > After a couple of internal discussions we reached a conclusion to
> > > > > > implement the Auxbus driver interface and fix the problem once and for
> > > > > > all.
> > > > >
> > > > > Thanks Devesh,
> > > > >
> > > > > Jason, it looks like we can proceed with this patchset, because in
> > > > > auxbus mode this module refcount and ULP indirection logics will be
> > > > > removed anyway.
> > > > >
> > > > > Thanks
> > > > Hi Leon,
> > > >
> > > > In my internal testing, I am seeing a crash using the 3rd patch. I am
> > > > spending a few cycles on debugging it. expect my input in a day or so.
> > >
> > > Can you please post the kernel crash report here?
> > > I don't see how function rename in patch #3 can cause to the crash.
> > Hey, unfortunately my kdump service config is giving me tough time on
> > my host. I will share if I get it.
>
> Any news here?
Expect something by this Friday. yesterday was a holiday in India.
>
> > >
> > > Thanks
> > >
> > > > >
> > > > > > >
> > > > > > > --
> > > > > > > 2.30.2
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > -Regards
> > > > > > Devesh
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > -Regards
> > > > Devesh
> > >
> > >
> >
> >
> > --
> > -Regards
> > Devesh
>
>
Leon Romanovsky April 17, 2021, 8:14 a.m. UTC | #11
On Wed, Apr 14, 2021 at 07:15:37PM +0530, Devesh Sharma wrote:
> On Mon, Apr 12, 2021 at 1:10 PM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Thu, Apr 08, 2021 at 08:42:57PM +0530, Devesh Sharma wrote:
> > > On Thu, Apr 8, 2021 at 5:14 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > >
> > > > On Thu, Apr 08, 2021 at 05:06:24PM +0530, Devesh Sharma wrote:
> > > > > On Sat, Apr 3, 2021 at 5:12 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > > > >
> > > > > > On Sat, Apr 03, 2021 at 03:52:13PM +0530, Devesh Sharma wrote:
> > > > > > > On Thu, Apr 1, 2021 at 12:27 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > > > > > >
> > > > > > > > From: Leon Romanovsky <leonro@nvidia.com>
> > > > > > > >
> > > > > > > > Changelog:
> > > > > > > > v2:
> > > > > > > >  * kbuild spotted that I didn't delete all code in patch #5, so deleted
> > > > > > > >    even more ulp_ops derefences.
> > > > > > > > v1: https://lore.kernel.org/linux-rdma/20210329085212.257771-1-leon@kernel.org
> > > > > > > >  * Go much deeper and removed useless ULP indirection
> > > > > > > > v0: https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
> > > > > > > > -----------------------------------------------------------------------
> > > > > > > >
> > > > > > > > The following series fixes issue spotted in [1], where bnxt_re driver
> > > > > > > > messed with module reference counting in order to implement symbol
> > > > > > > > dependency of bnxt_re and bnxt modules. All of this is done, when in
> > > > > > > > upstream we have only one ULP user of that bnxt module. The simple
> > > > > > > > declaration of exported symbol would do the trick.
> > > > > > > >
> > > > > > > > This series removes that custom module_get/_put, which is not supposed
> > > > > > > > to be in the driver from the beginning and get rid of nasty indirection
> > > > > > > > logic that isn't relevant for the upstream code.
> > > > > > > >
> > > > > > > > Such small changes allow us to simplify the bnxt code and my hope that
> > > > > > > > Devesh will continue where I stopped and remove struct bnxt_ulp_ops too.
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > > [1] https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
> > > > > > > >
> > > > > > > > Leon Romanovsky (5):
> > > > > > > >   RDMA/bnxt_re: Depend on bnxt ethernet driver and not blindly select it
> > > > > > > >   RDMA/bnxt_re: Create direct symbolic link between bnxt modules
> > > > > > > >   RDMA/bnxt_re: Get rid of custom module reference counting
> > > > > > > >   net/bnxt: Remove useless check of non-existent ULP id
> > > > > > > >   net/bnxt: Use direct API instead of useless indirection
> > > > > > > >
> > > > > > > >  drivers/infiniband/hw/bnxt_re/Kconfig         |   4 +-
> > > > > > > >  drivers/infiniband/hw/bnxt_re/main.c          |  93 ++-----
> > > > > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt.c     |   4 +-
> > > > > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt.h     |   1 -
> > > > > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 245 +++++++-----------
> > > > > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h |  32 +--
> > > > > > > >  6 files changed, 119 insertions(+), 260 deletions(-)
> > > > > > >
> > > > > > > Hi Leon,
> > > > > > >
> > > > > > > After a couple of internal discussions we reached a conclusion to
> > > > > > > implement the Auxbus driver interface and fix the problem once and for
> > > > > > > all.
> > > > > >
> > > > > > Thanks Devesh,
> > > > > >
> > > > > > Jason, it looks like we can proceed with this patchset, because in
> > > > > > auxbus mode this module refcount and ULP indirection logics will be
> > > > > > removed anyway.
> > > > > >
> > > > > > Thanks
> > > > > Hi Leon,
> > > > >
> > > > > In my internal testing, I am seeing a crash using the 3rd patch. I am
> > > > > spending a few cycles on debugging it. expect my input in a day or so.
> > > >
> > > > Can you please post the kernel crash report here?
> > > > I don't see how function rename in patch #3 can cause to the crash.
> > > Hey, unfortunately my kdump service config is giving me tough time on
> > > my host. I will share if I get it.
> >
> > Any news here?
> Expect something by this Friday. yesterday was a holiday in India.

Any update? 
This series is close to three weeks already and I would like to progress with it.

Thanks
Devesh Sharma April 17, 2021, 6:39 p.m. UTC | #12
On Sat, Apr 17, 2021 at 1:44 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Wed, Apr 14, 2021 at 07:15:37PM +0530, Devesh Sharma wrote:
> > On Mon, Apr 12, 2021 at 1:10 PM Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > On Thu, Apr 08, 2021 at 08:42:57PM +0530, Devesh Sharma wrote:
> > > > On Thu, Apr 8, 2021 at 5:14 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > > >
> > > > > On Thu, Apr 08, 2021 at 05:06:24PM +0530, Devesh Sharma wrote:
> > > > > > On Sat, Apr 3, 2021 at 5:12 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > > > > >
> > > > > > > On Sat, Apr 03, 2021 at 03:52:13PM +0530, Devesh Sharma wrote:
> > > > > > > > On Thu, Apr 1, 2021 at 12:27 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > > > > > > >
> > > > > > > > > From: Leon Romanovsky <leonro@nvidia.com>
> > > > > > > > >
> > > > > > > > > Changelog:
> > > > > > > > > v2:
> > > > > > > > >  * kbuild spotted that I didn't delete all code in patch #5, so deleted
> > > > > > > > >    even more ulp_ops derefences.
> > > > > > > > > v1: https://lore.kernel.org/linux-rdma/20210329085212.257771-1-leon@kernel.org
> > > > > > > > >  * Go much deeper and removed useless ULP indirection
> > > > > > > > > v0: https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
> > > > > > > > > -----------------------------------------------------------------------
> > > > > > > > >
> > > > > > > > > The following series fixes issue spotted in [1], where bnxt_re driver
> > > > > > > > > messed with module reference counting in order to implement symbol
> > > > > > > > > dependency of bnxt_re and bnxt modules. All of this is done, when in
> > > > > > > > > upstream we have only one ULP user of that bnxt module. The simple
> > > > > > > > > declaration of exported symbol would do the trick.
> > > > > > > > >
> > > > > > > > > This series removes that custom module_get/_put, which is not supposed
> > > > > > > > > to be in the driver from the beginning and get rid of nasty indirection
> > > > > > > > > logic that isn't relevant for the upstream code.
> > > > > > > > >
> > > > > > > > > Such small changes allow us to simplify the bnxt code and my hope that
> > > > > > > > > Devesh will continue where I stopped and remove struct bnxt_ulp_ops too.
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > > [1] https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
> > > > > > > > >
> > > > > > > > > Leon Romanovsky (5):
> > > > > > > > >   RDMA/bnxt_re: Depend on bnxt ethernet driver and not blindly select it
> > > > > > > > >   RDMA/bnxt_re: Create direct symbolic link between bnxt modules
> > > > > > > > >   RDMA/bnxt_re: Get rid of custom module reference counting
> > > > > > > > >   net/bnxt: Remove useless check of non-existent ULP id
> > > > > > > > >   net/bnxt: Use direct API instead of useless indirection
> > > > > > > > >
> > > > > > > > >  drivers/infiniband/hw/bnxt_re/Kconfig         |   4 +-
> > > > > > > > >  drivers/infiniband/hw/bnxt_re/main.c          |  93 ++-----
> > > > > > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt.c     |   4 +-
> > > > > > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt.h     |   1 -
> > > > > > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 245 +++++++-----------
> > > > > > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h |  32 +--
> > > > > > > > >  6 files changed, 119 insertions(+), 260 deletions(-)
> > > > > > > >
> > > > > > > > Hi Leon,
> > > > > > > >
> > > > > > > > After a couple of internal discussions we reached a conclusion to
> > > > > > > > implement the Auxbus driver interface and fix the problem once and for
> > > > > > > > all.
> > > > > > >
> > > > > > > Thanks Devesh,
> > > > > > >
> > > > > > > Jason, it looks like we can proceed with this patchset, because in
> > > > > > > auxbus mode this module refcount and ULP indirection logics will be
> > > > > > > removed anyway.
> > > > > > >
> > > > > > > Thanks
> > > > > > Hi Leon,
> > > > > >
> > > > > > In my internal testing, I am seeing a crash using the 3rd patch. I am
> > > > > > spending a few cycles on debugging it. expect my input in a day or so.
> > > > >
> > > > > Can you please post the kernel crash report here?
> > > > > I don't see how function rename in patch #3 can cause to the crash.
> > > > Hey, unfortunately my kdump service config is giving me tough time on
> > > > my host. I will share if I get it.
> > >
> > > Any news here?
> > Expect something by this Friday. yesterday was a holiday in India.
>
> Any update?
> This series is close to three weeks already and I would like to progress with it.
Hi Leon,

The host crash I indicated earlier is actually caused by patch 4 and
not by patch 3 from this series. I spent time to root cause the
problem and realized that patch-4 is touching quite many areas which
would require much intrusive testing and validation.
As I indicated earlier, we are implementing the PCI Aux driver
interface at a faster pace. While PCI Aux changes are in progress we
are willing to retain the existing bnxt_re and bnxt_en interface
untouched.
The problem of module referencing would be rectified with PCI aux
change by inheritance.
>
> Thanks




--
-Regards
Devesh
Leon Romanovsky April 18, 2021, 4:18 a.m. UTC | #13
On Sun, Apr 18, 2021 at 12:09:16AM +0530, Devesh Sharma wrote:
> On Sat, Apr 17, 2021 at 1:44 PM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Wed, Apr 14, 2021 at 07:15:37PM +0530, Devesh Sharma wrote:
> > > On Mon, Apr 12, 2021 at 1:10 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > >
> > > > On Thu, Apr 08, 2021 at 08:42:57PM +0530, Devesh Sharma wrote:
> > > > > On Thu, Apr 8, 2021 at 5:14 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > > > >
> > > > > > On Thu, Apr 08, 2021 at 05:06:24PM +0530, Devesh Sharma wrote:
> > > > > > > On Sat, Apr 3, 2021 at 5:12 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > > > > > >
> > > > > > > > On Sat, Apr 03, 2021 at 03:52:13PM +0530, Devesh Sharma wrote:
> > > > > > > > > On Thu, Apr 1, 2021 at 12:27 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > > > > > > > >
> > > > > > > > > > From: Leon Romanovsky <leonro@nvidia.com>
> > > > > > > > > >
> > > > > > > > > > Changelog:
> > > > > > > > > > v2:
> > > > > > > > > >  * kbuild spotted that I didn't delete all code in patch #5, so deleted
> > > > > > > > > >    even more ulp_ops derefences.
> > > > > > > > > > v1: https://lore.kernel.org/linux-rdma/20210329085212.257771-1-leon@kernel.org
> > > > > > > > > >  * Go much deeper and removed useless ULP indirection
> > > > > > > > > > v0: https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
> > > > > > > > > > -----------------------------------------------------------------------
> > > > > > > > > >
> > > > > > > > > > The following series fixes issue spotted in [1], where bnxt_re driver
> > > > > > > > > > messed with module reference counting in order to implement symbol
> > > > > > > > > > dependency of bnxt_re and bnxt modules. All of this is done, when in
> > > > > > > > > > upstream we have only one ULP user of that bnxt module. The simple
> > > > > > > > > > declaration of exported symbol would do the trick.
> > > > > > > > > >
> > > > > > > > > > This series removes that custom module_get/_put, which is not supposed
> > > > > > > > > > to be in the driver from the beginning and get rid of nasty indirection
> > > > > > > > > > logic that isn't relevant for the upstream code.
> > > > > > > > > >
> > > > > > > > > > Such small changes allow us to simplify the bnxt code and my hope that
> > > > > > > > > > Devesh will continue where I stopped and remove struct bnxt_ulp_ops too.
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > > [1] https://lore.kernel.org/linux-rdma/20210324142524.1135319-1-leon@kernel.org
> > > > > > > > > >
> > > > > > > > > > Leon Romanovsky (5):
> > > > > > > > > >   RDMA/bnxt_re: Depend on bnxt ethernet driver and not blindly select it
> > > > > > > > > >   RDMA/bnxt_re: Create direct symbolic link between bnxt modules
> > > > > > > > > >   RDMA/bnxt_re: Get rid of custom module reference counting
> > > > > > > > > >   net/bnxt: Remove useless check of non-existent ULP id
> > > > > > > > > >   net/bnxt: Use direct API instead of useless indirection
> > > > > > > > > >
> > > > > > > > > >  drivers/infiniband/hw/bnxt_re/Kconfig         |   4 +-
> > > > > > > > > >  drivers/infiniband/hw/bnxt_re/main.c          |  93 ++-----
> > > > > > > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt.c     |   4 +-
> > > > > > > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt.h     |   1 -
> > > > > > > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 245 +++++++-----------
> > > > > > > > > >  drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h |  32 +--
> > > > > > > > > >  6 files changed, 119 insertions(+), 260 deletions(-)
> > > > > > > > >
> > > > > > > > > Hi Leon,
> > > > > > > > >
> > > > > > > > > After a couple of internal discussions we reached a conclusion to
> > > > > > > > > implement the Auxbus driver interface and fix the problem once and for
> > > > > > > > > all.
> > > > > > > >
> > > > > > > > Thanks Devesh,
> > > > > > > >
> > > > > > > > Jason, it looks like we can proceed with this patchset, because in
> > > > > > > > auxbus mode this module refcount and ULP indirection logics will be
> > > > > > > > removed anyway.
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > Hi Leon,
> > > > > > >
> > > > > > > In my internal testing, I am seeing a crash using the 3rd patch. I am
> > > > > > > spending a few cycles on debugging it. expect my input in a day or so.
> > > > > >
> > > > > > Can you please post the kernel crash report here?
> > > > > > I don't see how function rename in patch #3 can cause to the crash.
> > > > > Hey, unfortunately my kdump service config is giving me tough time on
> > > > > my host. I will share if I get it.
> > > >
> > > > Any news here?
> > > Expect something by this Friday. yesterday was a holiday in India.
> >
> > Any update?
> > This series is close to three weeks already and I would like to progress with it.
> Hi Leon,
> 
> The host crash I indicated earlier is actually caused by patch 4 and
> not by patch 3 from this series. I spent time to root cause the
> problem and realized that patch-4 is touching quite many areas which
> would require much intrusive testing and validation.
> As I indicated earlier, we are implementing the PCI Aux driver
> interface at a faster pace. While PCI Aux changes are in progress we
> are willing to retain the existing bnxt_re and bnxt_en interface
> untouched.
> The problem of module referencing would be rectified with PCI aux
> change by inheritance.

Sorry no, the first three patches are not controversial and better to be
applied now. They do the right thing and they are correct.

There is a little trust in your promises above after you didn't show us kernel
panic despite our numerous requests. I also very sceptical in Broadcom ability
to provide auxbus implementation in timely manner.

It is worth to mention that auxbus won't eliminate the patches #4 and #5, but
will embed them into your auxbus conversion.

Jason, please take first three patches so internal HW IB driver won't do the crazy
module management that is totally out of scope for drivers/infiniband and not needed.

Thanks

> >
> > Thanks
> 
> 
> 
> 
> --
> -Regards
> Devesh
Jason Gunthorpe April 19, 2021, 5:38 p.m. UTC | #14
On Sun, Apr 18, 2021 at 12:09:16AM +0530, Devesh Sharma wrote:

> The host crash I indicated earlier is actually caused by patch 4 and
> not by patch 3 from this series. I spent time to root cause the

This makes a lot more sense.

The ulp_id stuff does need to go away as well though.

> problem and realized that patch-4 is touching quite many areas which
> would require much intrusive testing and validation.
> As I indicated earlier, we are implementing the PCI Aux driver
> interface at a faster pace.

Doing an aux driver doesn't mean you get to keep all these single
implementation function pointers - see the discussion around Intel's
patches.

> The problem of module referencing would be rectified with PCI aux
> change by inheritance.

The first three patches are clearly an improvement, and quite trivial,
so I'm going to take them.

Jason
Devesh Sharma April 19, 2021, 7:04 p.m. UTC | #15
On Mon, Apr 19, 2021 at 11:08 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Sun, Apr 18, 2021 at 12:09:16AM +0530, Devesh Sharma wrote:
>
> > The host crash I indicated earlier is actually caused by patch 4 and
> > not by patch 3 from this series. I spent time to root cause the
>
> This makes a lot more sense.
>
> The ulp_id stuff does need to go away as well though.
We shall address this concern with Aux implementation.
>
> > problem and realized that patch-4 is touching quite many areas which
> > would require much intrusive testing and validation.
> > As I indicated earlier, we are implementing the PCI Aux driver
> > interface at a faster pace.
>
> Doing an aux driver doesn't mean you get to keep all these single
> implementation function pointers - see the discussion around Intel's
> patches.
Sure, We will try addressing this concern as well. Could you please
point me to the exact patches please...
>
> > The problem of module referencing would be rectified with PCI aux
> > change by inheritance.
>
> The first three patches are clearly an improvement, and quite trivial,
> so I'm going to take them.
To my mind the first 3 patches in this series are fine and I agree to
Ack those. Of-course I want to spend some more time
establishing all of our internal test harness  passing. I can Ack
those in a couple of days at the earliest.
Further, I do not want to proceed with patch 4 and 5 as those patches
are too intrusive and cause host hang/crash during rmmod bnxt_re
followed by rmmod bnxt_en.
Probably this is some kind of race. If I try to add a few prints to
the debug, the problem does not appear.
Since, we want to remain focused on PCI Aux implementation instead of
resolving instabilities at this point, we would like to pick up the
idea from 4 and 5 in our PCI Aux implementation.

>
> Jason