[v4] IB/hfi1: allocate dummy net_device dynamically

Message ID 20240319090944.2021309-1-leitao@debian.org (mailing list archive)
State Superseded

Commit Message

Breno Leitao March 19, 2024, 9:09 a.m. UTC
Embedding net_device into structures prohibits the usage of flexible
arrays in the net_device structure. For more details, see the discussion
at [1].

Un-embed the net_device from struct hfi1_netdev_rx by converting it
into a pointer. Then use alloc_netdev() to allocate the net_device
object in hfi1_alloc_rx().

[1] https://lore.kernel.org/all/20240229225910.79e224cf@kernel.org/

Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>

---
Changelog

v2:
	* Free struct hfi1_netdev_rx allocation if alloc_netdev() fails
	* Pass zero as the private size for alloc_netdev().
	* Remove wrong reference for iwl in the comments

v3:
	* Re-worded the comment, by removing the first paragraph.

v4:
	* Fix the changelog format
---
 drivers/infiniband/hw/hfi1/netdev.h    |  2 +-
 drivers/infiniband/hw/hfi1/netdev_rx.c | 10 ++++++++--
 2 files changed, 9 insertions(+), 3 deletions(-)
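The un-embedding described in the commit message can be illustrated outside the kernel. The following is a minimal userspace sketch, not the hfi1 code: struct fake_netdev, struct fake_rx and the fake_* helpers are hypothetical names invented for illustration. It assumes a device structure that ends in a flexible array, which ISO C does not allow to be a member of another structure, so the owner keeps a pointer and frees its own allocation when the device allocation fails.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct net_device; not the kernel definition. */
struct fake_netdev {
	unsigned int mtu;
	size_t priv_len;
	unsigned char priv[];	/* flexible array member: must be last */
};

/*
 * Owner structure. ISO C does not allow a structure that ends in a
 * flexible array member to be a member of another structure, so the
 * owner stores a pointer instead of embedding the device.
 */
struct fake_rx {
	struct fake_netdev *dev;
	int num_rx_q;
};

/* Allocate the device and its trailing private area in one block. */
static struct fake_netdev *fake_alloc_netdev(size_t priv_len)
{
	struct fake_netdev *dev = calloc(1, sizeof(*dev) + priv_len);

	if (!dev)
		return NULL;
	dev->priv_len = priv_len;
	return dev;
}

/* Same shape as the patched hfi1_alloc_rx(): undo the owner on failure. */
static struct fake_rx *fake_alloc_rx(size_t priv_len)
{
	struct fake_rx *rx = calloc(1, sizeof(*rx));

	if (!rx)
		return NULL;
	rx->dev = fake_alloc_netdev(priv_len);
	if (!rx->dev) {
		free(rx);
		return NULL;
	}
	return rx;
}

/* Release the device before the owner that points to it. */
static void fake_free_rx(struct fake_rx *rx)
{
	if (!rx)
		return;
	free(rx->dev);
	free(rx);
}
```

A caller would pair fake_alloc_rx() with fake_free_rx(), just as the patch pairs alloc_netdev() with free_netdev().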

Comments

Leon Romanovsky April 1, 2024, 11:53 a.m. UTC | #1
On Tue, Mar 19, 2024 at 02:09:43AM -0700, Breno Leitao wrote:
> Embedding net_device into structures prohibits the usage of flexible
> arrays in the net_device structure. For more details, see the discussion
> at [1].
> 
> Un-embed the net_device from struct hfi1_netdev_rx by converting it
> into a pointer. Then use the leverage alloc_netdev() to allocate the
> net_device object at hfi1_alloc_rx().
> 
> [1] https://lore.kernel.org/all/20240229225910.79e224cf@kernel.org/
> 
> Signed-off-by: Breno Leitao <leitao@debian.org>
> Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>

Jakub,

I create shared branch for you, please pull it from:
https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/log/?h=remove-dummy-netdev

Thanks

Leon Romanovsky April 1, 2024, 11:53 a.m. UTC | #2
On Tue, 19 Mar 2024 02:09:43 -0700, Breno Leitao wrote:
> Embedding net_device into structures prohibits the usage of flexible
> arrays in the net_device structure. For more details, see the discussion
> at [1].
> 
> Un-embed the net_device from struct hfi1_netdev_rx by converting it
> into a pointer. Then use the leverage alloc_netdev() to allocate the
> net_device object at hfi1_alloc_rx().
> 
> [...]

Applied, thanks!

[1/1] IB/hfi1: allocate dummy net_device dynamically
      https://git.kernel.org/rdma/rdma/c/c965b039a750c4

Best regards,

Jakub Kicinski April 1, 2024, 2:53 p.m. UTC | #3
On Mon, 1 Apr 2024 14:53:31 +0300 Leon Romanovsky wrote:
> On Tue, Mar 19, 2024 at 02:09:43AM -0700, Breno Leitao wrote:
> > Embedding net_device into structures prohibits the usage of flexible
> > arrays in the net_device structure. For more details, see the discussion
> > at [1].
> > 
> > Un-embed the net_device from struct hfi1_netdev_rx by converting it
> > into a pointer. Then use the leverage alloc_netdev() to allocate the
> > net_device object at hfi1_alloc_rx().
> > 
> > [1] https://lore.kernel.org/all/20240229225910.79e224cf@kernel.org/
> > 
> > Signed-off-by: Breno Leitao <leitao@debian.org>
> > Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>  
> 
> Jakub,
> 
> I create shared branch for you, please pull it from:
> https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/log/?h=remove-dummy-netdev

Did you merge it in already?
Turned out that the use of init_dummy_netdev as a setup function
is broken, I'm not sure how Dennis tested this :(
We should have pinged you, sorry.

Dennis Dalessandro April 1, 2024, 3:34 p.m. UTC | #4
On 4/1/24 10:53 AM, Jakub Kicinski wrote:
> On Mon, 1 Apr 2024 14:53:31 +0300 Leon Romanovsky wrote:
>> On Tue, Mar 19, 2024 at 02:09:43AM -0700, Breno Leitao wrote:
>>> Embedding net_device into structures prohibits the usage of flexible
>>> arrays in the net_device structure. For more details, see the discussion
>>> at [1].
>>>
>>> Un-embed the net_device from struct hfi1_netdev_rx by converting it
>>> into a pointer. Then use the leverage alloc_netdev() to allocate the
>>> net_device object at hfi1_alloc_rx().
>>>
>>> [1] https://lore.kernel.org/all/20240229225910.79e224cf@kernel.org/
>>>
>>> Signed-off-by: Breno Leitao <leitao@debian.org>
>>> Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>  
>>
>> Jakub,
>>
>> I create shared branch for you, please pull it from:
>> https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/log/?h=remove-dummy-netdev
> 
> Did you merge it in already?
> Turned out that the use of init_dummy_netdev as a setup function
> is broken, I'm not sure how Dennis tested this :(
> We should have pinged you, sorry.

This is what I tested, Linus 6.8 tag + cherry pick + Breno patch. So if
something went in that broke it I didn't have it in my tree.

commit 311810a6d7e37d8e7537d50e26197b7f5f02f164 (linus-master)
Author: Breno Leitao <leitao@debian.org>
Date:   Wed Mar 13 03:33:10 2024 -0700

    IB/hfi1: allocate dummy net_device dynamically

    struct net_device shouldn't be embedded into any structure, instead,
    the owner should use the priv space to embed their state into net_device.

    Embedding net_device into structures prohibits the usage of flexible
    arrays in the net_device structure. For more details, see the discussion
    at [1].

    Un-embed the net_device from struct hfi1_netdev_rx by converting it
    into a pointer. Then use the leverage alloc_netdev() to allocate the
    net_device object at hfi1_alloc_rx().

    [1] https://lore.kernel.org/all/20240229225910.79e224cf@kernel.org/

    Signed-off-by: Breno Leitao <leitao@debian.org>

    ----
    PS: this diff needs d160c66cda0ac8614 ("net: Do not return value from
    init_dummy_netdev()") in order to apply and build cleanly.

commit 1e06cffe69e6519f8ede42c60f13ad3a7ddb09b7
Author: Amit Cohen <amcohen@nvidia.com>
Date:   Mon Feb 5 12:30:22 2024 +0200

    net: Do not return value from init_dummy_netdev()

    init_dummy_netdev() always returns zero and all the callers do not check
    the returned value. Set the function to not return value, as it is not
    really used today.

    Signed-off-by: Amit Cohen <amcohen@nvidia.com>
    Reviewed-by: Ido Schimmel <idosch@nvidia.com>
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20240205103022.440946-1-amcohen@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

commit e8f897f4afef0031fe618a8e94127a0934896aba (tag: v6.8)
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Sun Mar 10 13:38:09 2024 -0700

    Linux 6.8

Leon Romanovsky April 1, 2024, 6 p.m. UTC | #5
On Mon, Apr 01, 2024 at 07:53:06AM -0700, Jakub Kicinski wrote:
> On Mon, 1 Apr 2024 14:53:31 +0300 Leon Romanovsky wrote:
> > On Tue, Mar 19, 2024 at 02:09:43AM -0700, Breno Leitao wrote:
> > > Embedding net_device into structures prohibits the usage of flexible
> > > arrays in the net_device structure. For more details, see the discussion
> > > at [1].
> > > 
> > > Un-embed the net_device from struct hfi1_netdev_rx by converting it
> > > into a pointer. Then use the leverage alloc_netdev() to allocate the
> > > net_device object at hfi1_alloc_rx().
> > > 
> > > [1] https://lore.kernel.org/all/20240229225910.79e224cf@kernel.org/
> > > 
> > > Signed-off-by: Breno Leitao <leitao@debian.org>
> > > Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>  
> > 
> > Jakub,
> > 
> > I create shared branch for you, please pull it from:
> > https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/log/?h=remove-dummy-netdev
> 
> Did you merge it in already?

I merged it into testing branch and dropped it now.

Thanks

Breno Leitao April 3, 2024, 12:15 p.m. UTC | #6
On Mon, Apr 01, 2024 at 11:34:23AM -0400, Dennis Dalessandro wrote:
> On 4/1/24 10:53 AM, Jakub Kicinski wrote:
> > On Mon, 1 Apr 2024 14:53:31 +0300 Leon Romanovsky wrote:
> >> On Tue, Mar 19, 2024 at 02:09:43AM -0700, Breno Leitao wrote:
> >>> Embedding net_device into structures prohibits the usage of flexible
> >>> arrays in the net_device structure. For more details, see the discussion
> >>> at [1].
> >>>
> >>> Un-embed the net_device from struct hfi1_netdev_rx by converting it
> >>> into a pointer. Then use the leverage alloc_netdev() to allocate the
> >>> net_device object at hfi1_alloc_rx().
> >>>
> >>> [1] https://lore.kernel.org/all/20240229225910.79e224cf@kernel.org/
> >>>
> >>> Signed-off-by: Breno Leitao <leitao@debian.org>
> >>> Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>  
> >>
> >> Jakub,
> >>
> >> I create shared branch for you, please pull it from:
> >> https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/log/?h=remove-dummy-netdev
> > 
> > Did you merge it in already?
> > Turned out that the use of init_dummy_netdev as a setup function
> > is broken, I'm not sure how Dennis tested this :(
> > We should have pinged you, sorry.
> 
> This is what I tested, Linus 6.8 tag + cherry pick + Breno patch. So if
> something went in that broke it I didn't have it in my tree.
> 
> commit 311810a6d7e37d8e7537d50e26197b7f5f02f164 (linus-master)
> Author: Breno Leitao <leitao@debian.org>
> Date:   Wed Mar 13 03:33:10 2024 -0700
> 
>     IB/hfi1: allocate dummy net_device dynamically

This one has a potential bug that causes a kernel panic when the module
is removed.

This is because alloc_netdev() allocates some data structures that are
later overwritten (memset) by init_dummy_netdev(), which runs as the
setup callback. At free time, free_netdev() dereferences those
structures, which have been zeroed.

An upcoming patch creates a helper (alloc_netdev_dummy()) that
will allocate the netdev and call a special version of
init_dummy_netdev() that does not memset the structure.

I would drop this patch for now, and I will submit a new version using
the new helper.
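The failure mode described above can be reproduced in miniature. The sketch below is a userspace illustration only: struct toy_netdev, the toy_* functions and the single refcnt field are hypothetical and do not mirror the real net_device layout. It assumes an allocator that sets up bookkeeping before invoking the setup callback, and contrasts a callback that memsets the whole object with one that only touches its own fields.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_REG_DUMMY 1

/* Hypothetical miniature of a net_device; field names are illustrative. */
struct toy_netdev {
	int *refcnt;	/* bookkeeping installed by the allocator */
	int reg_state;	/* state owned by the setup callback */
};

/* Stands in for allocator-owned state that the free path expects to find. */
static int toy_refcnt_storage;

/* Allocator: installs bookkeeping first, then runs the caller's setup hook. */
static struct toy_netdev *toy_alloc_netdev(void (*setup)(struct toy_netdev *))
{
	struct toy_netdev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->refcnt = &toy_refcnt_storage;
	setup(dev);
	return dev;
}

/* Broken as a setup hook: clears the whole object, losing dev->refcnt. */
static void toy_init_dummy_memset(struct toy_netdev *dev)
{
	memset(dev, 0, sizeof(*dev));
	dev->reg_state = TOY_REG_DUMMY;
}

/* Safe as a setup hook: only initializes the fields it owns. */
static void toy_init_dummy_nomemset(struct toy_netdev *dev)
{
	dev->reg_state = TOY_REG_DUMMY;
}

/* Returns 1 if the allocator's bookkeeping survived setup, 0 otherwise. */
static int toy_bookkeeping_intact(const struct toy_netdev *dev)
{
	return dev->refcnt != NULL;
}

/* The toy checks for NULL; a free path that dereferences blindly would crash. */
static void toy_free_netdev(struct toy_netdev *dev)
{
	if (!dev)
		return;
	if (dev->refcnt)
		*dev->refcnt = 0;
	free(dev);
}
```

With toy_init_dummy_memset() as the callback, toy_bookkeeping_intact() reports 0, which corresponds to the zeroed structures that free_netdev() later trips over. With toy_init_dummy_nomemset() the bookkeeping survives.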
Patch

diff --git a/drivers/infiniband/hw/hfi1/netdev.h b/drivers/infiniband/hw/hfi1/netdev.h
index 8aa074670a9c..07c8f77c9181 100644
--- a/drivers/infiniband/hw/hfi1/netdev.h
+++ b/drivers/infiniband/hw/hfi1/netdev.h
@@ -49,7 +49,7 @@  struct hfi1_netdev_rxq {
  *		When 0 receive queues will be freed.
  */
 struct hfi1_netdev_rx {
-	struct net_device rx_napi;
+	struct net_device *rx_napi;
 	struct hfi1_devdata *dd;
 	struct hfi1_netdev_rxq *rxq;
 	int num_rx_q;
diff --git a/drivers/infiniband/hw/hfi1/netdev_rx.c b/drivers/infiniband/hw/hfi1/netdev_rx.c
index 720d4c85c9c9..cd6e78e257ef 100644
--- a/drivers/infiniband/hw/hfi1/netdev_rx.c
+++ b/drivers/infiniband/hw/hfi1/netdev_rx.c
@@ -188,7 +188,7 @@  static int hfi1_netdev_rxq_init(struct hfi1_netdev_rx *rx)
 	int i;
 	int rc;
 	struct hfi1_devdata *dd = rx->dd;
-	struct net_device *dev = &rx->rx_napi;
+	struct net_device *dev = rx->rx_napi;
 
 	rx->num_rx_q = dd->num_netdev_contexts;
 	rx->rxq = kcalloc_node(rx->num_rx_q, sizeof(*rx->rxq),
@@ -360,7 +360,12 @@  int hfi1_alloc_rx(struct hfi1_devdata *dd)
 	if (!rx)
 		return -ENOMEM;
 	rx->dd = dd;
-	init_dummy_netdev(&rx->rx_napi);
+	rx->rx_napi = alloc_netdev(0, "dummy", NET_NAME_UNKNOWN,
+				   init_dummy_netdev);
+	if (!rx->rx_napi) {
+		kfree(rx);
+		return -ENOMEM;
+	}
 
 	xa_init(&rx->dev_tbl);
 	atomic_set(&rx->enabled, 0);
@@ -374,6 +379,7 @@  void hfi1_free_rx(struct hfi1_devdata *dd)
 {
 	if (dd->netdev_rx) {
 		dd_dev_info(dd, "hfi1 rx freed\n");
+		free_netdev(dd->netdev_rx->rx_napi);
 		kfree(dd->netdev_rx);
 		dd->netdev_rx = NULL;
 	}