Message ID | 6b6b07f5-250c-415e-bdc4-bd08ac69b24d@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] r8169: fix LED-related deadlock on module removal | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Clearly marked for net |
netdev/apply | success | Patch already applied to net-0 |
On Mon, Apr 15, 2024 at 08:44:35AM +0200, Heiner Kallweit wrote: > Binding devm_led_classdev_register() to the netdev is problematic > because on module removal we get a RTNL-related deadlock. More precisely the issue is triggered on driver unbind. Module unload as well as device unplug imply driver unbinding. > The original change was introduced with 6.8, 6.9 added support for > LEDs on RTL8125. Therefore the first version of the fix applied on > 6.9-rc only. This is the modified version for 6.8. I guess the recipient of this patch should have been the stable maintainers then, not netdev maintainers. > --- a/drivers/net/ethernet/realtek/r8169.h > +++ b/drivers/net/ethernet/realtek/r8169.h > @@ -72,6 +72,7 @@ enum mac_version { > }; > > struct rtl8169_private; > +struct r8169_led_classdev; Normally these forward declarations are not needed if you're just referencing the struct name in a pointer. Usage of the struct name in a pointer implies a forward declaration. > +struct r8169_led_classdev *rtl8168_init_leds(struct net_device *ndev) > { > - /* bind resource mgmt to netdev */ > - struct device *dev = &ndev->dev; > struct r8169_led_classdev *leds; > int i; > > - leds = devm_kcalloc(dev, RTL8168_NUM_LEDS, sizeof(*leds), GFP_KERNEL); > + leds = kcalloc(RTL8168_NUM_LEDS + 1, sizeof(*leds), GFP_KERNEL); > if (!leds) > - return; > + return NULL; > > for (i = 0; i < RTL8168_NUM_LEDS; i++) > rtl8168_setup_ldev(leds + i, ndev, i); > + > + return leds; > +} If registration of some LEDs fails, you seem to continue driver binding. So the leds allocation may stick around even if it's not used at all. Not a big deal, but not super pretty either. Thanks, Lukas
On 15.04.2024 10:49, Lukas Wunner wrote: > On Mon, Apr 15, 2024 at 08:44:35AM +0200, Heiner Kallweit wrote: >> Binding devm_led_classdev_register() to the netdev is problematic >> because on module removal we get a RTNL-related deadlock. > > More precisely the issue is triggered on driver unbind. > > Module unload as well as device unplug imply driver unbinding. > > >> The original change was introduced with 6.8, 6.9 added support for >> LEDs on RTL8125. Therefore the first version of the fix applied on >> 6.9-rc only. This is the modified version for 6.8. > > I guess the recipient of this patch should have been the stable > maintainers then, not netdev maintainers. > OK, seems this has changed over time. Following still mentioned that netdev is special. https://www.kernel.org/doc/html/v4.19/process/stable-kernel-rules.html > >> --- a/drivers/net/ethernet/realtek/r8169.h >> +++ b/drivers/net/ethernet/realtek/r8169.h >> @@ -72,6 +72,7 @@ enum mac_version { >> }; >> >> struct rtl8169_private; >> +struct r8169_led_classdev; > > Normally these forward declarations are not needed if you're just > referencing the struct name in a pointer. Usage of the struct name > in a pointer implies a forward declaration. > Even if technically not needed, it seems to be kernel best practice to use forward declarations, see e.g. device.h. However I'd be interested in hearing the maintainers position to consider this with the next submissions. > >> +struct r8169_led_classdev *rtl8168_init_leds(struct net_device *ndev) >> { >> - /* bind resource mgmt to netdev */ >> - struct device *dev = &ndev->dev; >> struct r8169_led_classdev *leds; >> int i; >> >> - leds = devm_kcalloc(dev, RTL8168_NUM_LEDS, sizeof(*leds), GFP_KERNEL); >> + leds = kcalloc(RTL8168_NUM_LEDS + 1, sizeof(*leds), GFP_KERNEL); >> if (!leds) >> - return; >> + return NULL; >> >> for (i = 0; i < RTL8168_NUM_LEDS; i++) >> rtl8168_setup_ldev(leds + i, ndev, i); >> + >> + return leds; >> +} > > If registration of some LEDs fails, you seem to continue driver binding. > So the leds allocation may stick around even if it's not used at all. > Not a big deal, but not super pretty either. > > Thanks, > > Lukas >
On Mon, 15 Apr 2024 10:49:11 +0200 Lukas Wunner wrote: > > struct rtl8169_private; > > +struct r8169_led_classdev; > > Normally these forward declarations are not needed if you're just > referencing the struct name in a pointer. Usage of the struct name > in a pointer implies a forward declaration. Unless something changed recently that only works for struct members, function args need an explicit forward declaration.
On Tue, Apr 16, 2024 at 04:41:13PM -0700, Jakub Kicinski wrote: > On Mon, 15 Apr 2024 10:49:11 +0200 Lukas Wunner wrote: > > > struct rtl8169_private; > > > +struct r8169_led_classdev; > > > > Normally these forward declarations are not needed if you're just > > referencing the struct name in a pointer. Usage of the struct name > > in a pointer implies a forward declaration. > > Unless something changed recently that only works for struct members, > function args need an explicit forward declaration. Not for pointers: "You can't use an incomplete type to declare a variable or field, or use it for a function parameter or return type. [...] However, you can define a pointer to an incomplete type, and declare a variable or field with such a pointer type. In general, you can do everything with such pointers except dereference them." https://gnu-c-language-manual.github.io/GNU-C-Language-Manual/Incomplete-Types.html That's the case here: struct r8169_led_classdev; struct r8169_led_classdev *rtl8168_init_leds(struct net_device *ndev); void r8169_remove_leds(struct r8169_led_classdev *leds); In this particular case, struct r8169_led_classdev is only used as a *pointer* passed to or returned from a function. There's no need for a forward declaration of the type behind the pointer. Thanks, Lukas
diff --git a/drivers/net/ethernet/realtek/r8169.h b/drivers/net/ethernet/realtek/r8169.h index 81567fcf3..1ef399287 100644 --- a/drivers/net/ethernet/realtek/r8169.h +++ b/drivers/net/ethernet/realtek/r8169.h @@ -72,6 +72,7 @@ enum mac_version { }; struct rtl8169_private; +struct r8169_led_classdev; void r8169_apply_firmware(struct rtl8169_private *tp); u16 rtl8168h_2_get_adc_bias_ioffset(struct rtl8169_private *tp); @@ -83,4 +84,5 @@ void r8169_get_led_name(struct rtl8169_private *tp, int idx, char *buf, int buf_len); int rtl8168_get_led_mode(struct rtl8169_private *tp); int rtl8168_led_mod_ctrl(struct rtl8169_private *tp, u16 mask, u16 val); -void rtl8168_init_leds(struct net_device *ndev); +struct r8169_led_classdev *rtl8168_init_leds(struct net_device *ndev); +void r8169_remove_leds(struct r8169_led_classdev *leds); diff --git a/drivers/net/ethernet/realtek/r8169_leds.c b/drivers/net/ethernet/realtek/r8169_leds.c index 007d077ed..1c97f3cca 100644 --- a/drivers/net/ethernet/realtek/r8169_leds.c +++ b/drivers/net/ethernet/realtek/r8169_leds.c @@ -138,20 +138,31 @@ static void rtl8168_setup_ldev(struct r8169_led_classdev *ldev, led_cdev->hw_control_get_device = r8169_led_hw_control_get_device; /* ignore errors */ - devm_led_classdev_register(&ndev->dev, led_cdev); + led_classdev_register(&ndev->dev, led_cdev); } -void rtl8168_init_leds(struct net_device *ndev) +struct r8169_led_classdev *rtl8168_init_leds(struct net_device *ndev) { - /* bind resource mgmt to netdev */ - struct device *dev = &ndev->dev; struct r8169_led_classdev *leds; int i; - leds = devm_kcalloc(dev, RTL8168_NUM_LEDS, sizeof(*leds), GFP_KERNEL); + leds = kcalloc(RTL8168_NUM_LEDS + 1, sizeof(*leds), GFP_KERNEL); if (!leds) - return; + return NULL; for (i = 0; i < RTL8168_NUM_LEDS; i++) rtl8168_setup_ldev(leds + i, ndev, i); + + return leds; +} + +void r8169_remove_leds(struct r8169_led_classdev *leds) +{ + if (!leds) + return; + + for (struct r8169_led_classdev *l = leds; l->ndev; l++) + led_classdev_unregister(&l->led); + + kfree(leds); } diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index 4b6c28576..32b73f398 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -634,6 +634,8 @@ struct rtl8169_private { const char *fw_name; struct rtl_fw *rtl_fw; + struct r8169_led_classdev *leds; + u32 ocp_base; }; @@ -4930,6 +4932,9 @@ static void rtl_remove_one(struct pci_dev *pdev) cancel_work_sync(&tp->wk.work); + if (IS_ENABLED(CONFIG_R8169_LEDS)) + r8169_remove_leds(tp->leds); + unregister_netdev(tp->dev); if (tp->dash_type != RTL_DASH_NONE) @@ -5391,7 +5396,7 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) if (IS_ENABLED(CONFIG_R8169_LEDS) && tp->mac_version > RTL_GIGA_MAC_VER_06 && tp->mac_version < RTL_GIGA_MAC_VER_61) - rtl8168_init_leds(dev); + tp->leds = rtl8168_init_leds(dev); netdev_info(dev, "%s, %pM, XID %03x, IRQ %d\n", rtl_chip_infos[chipset].name, dev->dev_addr, xid, tp->irq);
Binding devm_led_classdev_register() to the netdev is problematic because on module removal we get a RTNL-related deadlock. Fix this by avoiding the device-managed LED functions. Note: We can safely call led_classdev_unregister() for a LED even if registering it failed, because led_classdev_unregister() detects this and is a no-op in this case. Fixes: 18764b883e15 ("r8169: add support for LED's on RTL8168/RTL8101") Cc: <stable@vger.kernel.org> # 6.8.x Reported-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> --- The original change was introduced with 6.8, 6.9 added support for LEDs on RTL8125. Therefore the first version of the fix applied on 6.9-rc only. This is the modified version for 6.8. --- drivers/net/ethernet/realtek/r8169.h | 4 +++- drivers/net/ethernet/realtek/r8169_leds.c | 23 +++++++++++++++++------ drivers/net/ethernet/realtek/r8169_main.c | 7 ++++++- 3 files changed, 26 insertions(+), 8 deletions(-)