[net,1/2] net: ravb: Fix missing rtnl lock in suspend path

Message ID 20250122-fix_missing_rtnl_lock_phy_disconnect-v1-1-8cb9f6f88fd1@bootlin.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series Fix missing rtnl lock in suspend path

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 8 of 8 maintainers
netdev/build_clang success Errors and warnings before: 1 this patch: 1
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 44 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2025-01-22--21-00 (tests: 885)

Commit Message

Kory Maincent Jan. 22, 2025, 4:19 p.m. UTC
Fix the suspend path by ensuring the rtnl lock is held where required.
Calls to ravb_open, ravb_close and wol operations must be performed under
the rtnl lock to prevent conflicts with ongoing ndo operations.

Without this fix, the following warning is triggered:
[   39.032969] =============================
[   39.032983] WARNING: suspicious RCU usage
[   39.033019] -----------------------------
[   39.033033] drivers/net/phy/phy_device.c:2004 suspicious
rcu_dereference_protected() usage!
...
[   39.033597] stack backtrace:
[   39.033613] CPU: 0 UID: 0 PID: 174 Comm: python3 Not tainted
6.13.0-rc7-next-20250116-arm64-renesas-00002-g35245dfdc62c #7
[   39.033623] Hardware name: Renesas SMARC EVK version 2 based on
r9a08g045s33 (DT)
[   39.033628] Call trace:
[   39.033633]  show_stack+0x14/0x1c (C)
[   39.033652]  dump_stack_lvl+0xb4/0xc4
[   39.033664]  dump_stack+0x14/0x1c
[   39.033671]  lockdep_rcu_suspicious+0x16c/0x22c
[   39.033682]  phy_detach+0x160/0x190
[   39.033694]  phy_disconnect+0x40/0x54
[   39.033703]  ravb_close+0x6c/0x1cc
[   39.033714]  ravb_suspend+0x48/0x120
[   39.033721]  dpm_run_callback+0x4c/0x14c
[   39.033731]  device_suspend+0x11c/0x4dc
[   39.033740]  dpm_suspend+0xdc/0x214
[   39.033748]  dpm_suspend_start+0x48/0x60
[   39.033758]  suspend_devices_and_enter+0x124/0x574
[   39.033769]  pm_suspend+0x1ac/0x274
[   39.033778]  state_store+0x88/0x124
[   39.033788]  kobj_attr_store+0x14/0x24
[   39.033798]  sysfs_kf_write+0x48/0x6c
[   39.033808]  kernfs_fop_write_iter+0x118/0x1a8
[   39.033817]  vfs_write+0x27c/0x378
[   39.033825]  ksys_write+0x64/0xf4
[   39.033833]  __arm64_sys_write+0x18/0x20
[   39.033841]  invoke_syscall+0x44/0x104
[   39.033852]  el0_svc_common.constprop.0+0xb4/0xd4
[   39.033862]  do_el0_svc+0x18/0x20
[   39.033870]  el0_svc+0x3c/0xf0
[   39.033880]  el0t_64_sync_handler+0xc0/0xc4
[   39.033888]  el0t_64_sync+0x154/0x158
[   39.041274] ravb 11c30000.ethernet eth0: Link is Down

Reported-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Closes: https://lore.kernel.org/netdev/4c6419d8-c06b-495c-b987-d66c2e1ff848@tuxon.dev/
Fixes: 0184165b2f42 ("ravb: add sleep PM suspend/resume support")
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
---
 drivers/net/ethernet/renesas/ravb_main.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

Comments

Niklas Söderlund Jan. 22, 2025, 5:22 p.m. UTC | #1
Hi Kory,

Thanks for your work.

On 2025-01-22 17:19:28 +0100, Kory Maincent wrote:
> Fix the suspend path by ensuring the rtnl lock is held where required.
> Calls to ravb_open, ravb_close and wol operations must be performed under
> the rtnl lock to prevent conflicts with ongoing ndo operations.
> 
> Without this fix, the following warning is triggered:
> [   39.032969] =============================
> [   39.032983] WARNING: suspicious RCU usage
> [   39.033019] -----------------------------
> [   39.033033] drivers/net/phy/phy_device.c:2004 suspicious
> rcu_dereference_protected() usage!
> ...
> [   39.033597] stack backtrace:
> [   39.033613] CPU: 0 UID: 0 PID: 174 Comm: python3 Not tainted
> 6.13.0-rc7-next-20250116-arm64-renesas-00002-g35245dfdc62c #7
> [   39.033623] Hardware name: Renesas SMARC EVK version 2 based on
> r9a08g045s33 (DT)
> [   39.033628] Call trace:
> [   39.033633]  show_stack+0x14/0x1c (C)
> [   39.033652]  dump_stack_lvl+0xb4/0xc4
> [   39.033664]  dump_stack+0x14/0x1c
> [   39.033671]  lockdep_rcu_suspicious+0x16c/0x22c
> [   39.033682]  phy_detach+0x160/0x190
> [   39.033694]  phy_disconnect+0x40/0x54
> [   39.033703]  ravb_close+0x6c/0x1cc
> [   39.033714]  ravb_suspend+0x48/0x120
> [   39.033721]  dpm_run_callback+0x4c/0x14c
> [   39.033731]  device_suspend+0x11c/0x4dc
> [   39.033740]  dpm_suspend+0xdc/0x214
> [   39.033748]  dpm_suspend_start+0x48/0x60
> [   39.033758]  suspend_devices_and_enter+0x124/0x574
> [   39.033769]  pm_suspend+0x1ac/0x274
> [   39.033778]  state_store+0x88/0x124
> [   39.033788]  kobj_attr_store+0x14/0x24
> [   39.033798]  sysfs_kf_write+0x48/0x6c
> [   39.033808]  kernfs_fop_write_iter+0x118/0x1a8
> [   39.033817]  vfs_write+0x27c/0x378
> [   39.033825]  ksys_write+0x64/0xf4
> [   39.033833]  __arm64_sys_write+0x18/0x20
> [   39.033841]  invoke_syscall+0x44/0x104
> [   39.033852]  el0_svc_common.constprop.0+0xb4/0xd4
> [   39.033862]  do_el0_svc+0x18/0x20
> [   39.033870]  el0_svc+0x3c/0xf0
> [   39.033880]  el0t_64_sync_handler+0xc0/0xc4
> [   39.033888]  el0t_64_sync+0x154/0x158
> [   39.041274] ravb 11c30000.ethernet eth0: Link is Down
> 
> Reported-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
> Closes: https://lore.kernel.org/netdev/4c6419d8-c06b-495c-b987-d66c2e1ff848@tuxon.dev/
> Fixes: 0184165b2f42 ("ravb: add sleep PM suspend/resume support")
> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>

I need to apply [1] to see the "WARNING: suspicious RCU usage" splat at
all; I guess there is a WARN_ONCE somewhere. But with this patch applied,
the splat is gone when resuming and WoL works.

Tested on R-Car M3N with NFS root on the interface used for WoL.

Tested-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>

1. [PATCH] gpio: rcar: Use raw_spinlock to protect register access

> ---
>  drivers/net/ethernet/renesas/ravb_main.c | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
> index bc395294a32d..2c6d8e4966c3 100644
> --- a/drivers/net/ethernet/renesas/ravb_main.c
> +++ b/drivers/net/ethernet/renesas/ravb_main.c
> @@ -3217,10 +3217,15 @@ static int ravb_suspend(struct device *dev)
>  
>  	netif_device_detach(ndev);
>  
> -	if (priv->wol_enabled)
> -		return ravb_wol_setup(ndev);
> +	rtnl_lock();
> +	if (priv->wol_enabled) {
> +		ret = ravb_wol_setup(ndev);
> +		rtnl_unlock();
> +		return ret;
> +	}
>  
>  	ret = ravb_close(ndev);
> +	rtnl_unlock();
>  	if (ret)
>  		return ret;
>  
> @@ -3245,19 +3250,25 @@ static int ravb_resume(struct device *dev)
>  	if (!netif_running(ndev))
>  		return 0;
>  
> +	rtnl_lock();
>  	/* If WoL is enabled restore the interface. */
>  	if (priv->wol_enabled) {
>  		ret = ravb_wol_restore(ndev);
> -		if (ret)
> +		if (ret)  {
> +			rtnl_unlock();
>  			return ret;
> +		}
>  	} else {
>  		ret = pm_runtime_force_resume(dev);
> -		if (ret)
> +		if (ret) {
> +			rtnl_unlock();
>  			return ret;
> +		}
>  	}
>  
>  	/* Reopening the interface will restore the device to the working state. */
>  	ret = ravb_open(ndev);
> +	rtnl_unlock();
>  	if (ret < 0)
>  		goto out_rpm_put;
>  
> 
> -- 
> 2.34.1
>
Sergey Shtylyov Jan. 22, 2025, 6:33 p.m. UTC | #2
Hello!

  My Cogent Embedded tenure is long over, so I dropped my old email... :-)

On 1/22/25 7:19 PM, Kory Maincent wrote:

> Fix the suspend path by ensuring the rtnl lock is held where required.

   Maybe suspend/resume path (the same w/the subject)?

> Calls to ravb_open, ravb_close and wol operations must be performed under
> the rtnl lock to prevent conflicts with ongoing ndo operations.

[...]

> Reported-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
> Closes: https://lore.kernel.org/netdev/4c6419d8-c06b-495c-b987-d66c2e1ff848@tuxon.dev/
> Fixes: 0184165b2f42 ("ravb: add sleep PM suspend/resume support")
> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>

    FWIW:

Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>

[...]
> diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
> index bc395294a32d..2c6d8e4966c3 100644
> --- a/drivers/net/ethernet/renesas/ravb_main.c
> +++ b/drivers/net/ethernet/renesas/ravb_main.c
[...]
> @@ -3245,19 +3250,25 @@ static int ravb_resume(struct device *dev)
>  	if (!netif_running(ndev))
>  		return 0;
>  
> +	rtnl_lock();
>  	/* If WoL is enabled restore the interface. */
>  	if (priv->wol_enabled) {
>  		ret = ravb_wol_restore(ndev);
> -		if (ret)
> +		if (ret)  {
> +			rtnl_unlock();
>  			return ret;
> +		}
>  	} else {
>  		ret = pm_runtime_force_resume(dev);
> -		if (ret)
> +		if (ret) {
> +			rtnl_unlock();
>  			return ret;

   Hm, are you sure we need to have rtnl_lock around pm_runtime_force_resume() too?

[...]

MBR, Sergey
Sergey Shtylyov Jan. 22, 2025, 6:53 p.m. UTC | #3
On 1/22/25 9:33 PM, Sergey Shtylyov wrote:
[...]

>> Fix the suspend path by ensuring the rtnl lock is held where required.
> 
>    Maybe suspend/resume path (the same w/the subject)?
> 
>> Calls to ravb_open, ravb_close and wol operations must be performed under
>> the rtnl lock to prevent conflicts with ongoing ndo operations.
> 
> [...]
> 
>> Reported-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
>> Closes: https://lore.kernel.org/netdev/4c6419d8-c06b-495c-b987-d66c2e1ff848@tuxon.dev/
>> Fixes: 0184165b2f42 ("ravb: add sleep PM suspend/resume support")
>> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
> 
>     FWIW:
> 
> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
> 
> [...]
>> diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
>> index bc395294a32d..2c6d8e4966c3 100644
>> --- a/drivers/net/ethernet/renesas/ravb_main.c
>> +++ b/drivers/net/ethernet/renesas/ravb_main.c
> [...]
>> @@ -3245,19 +3250,25 @@ static int ravb_resume(struct device *dev)
>>  	if (!netif_running(ndev))
>>  		return 0;
>>  
>> +	rtnl_lock();
>>  	/* If WoL is enabled restore the interface. */
>>  	if (priv->wol_enabled) {
>>  		ret = ravb_wol_restore(ndev);
>> -		if (ret)
>> +		if (ret)  {
>> +			rtnl_unlock();
>>  			return ret;
>> +		}
>>  	} else {
>>  		ret = pm_runtime_force_resume(dev);
>> -		if (ret)
>> +		if (ret) {
>> +			rtnl_unlock();
>>  			return ret;
> 
>    Hm, are you sure we need to have rtnl_lock around pm_runtime_force_resume() too?
 
   Anyway, the above *if* statements are needlessly duplicated, I think it's time
that we do away with this! :-)

[...]

MBR, Sergey
Paul Barker Jan. 23, 2025, 9:30 a.m. UTC | #4
On 22/01/2025 16:19, Kory Maincent wrote:
> Fix the suspend path by ensuring the rtnl lock is held where required.
> Calls to ravb_open, ravb_close and wol operations must be performed under
> the rtnl lock to prevent conflicts with ongoing ndo operations.
> 
> Without this fix, the following warning is triggered:
> [   39.032969] =============================
> [   39.032983] WARNING: suspicious RCU usage
> [   39.033019] -----------------------------
> [   39.033033] drivers/net/phy/phy_device.c:2004 suspicious
> rcu_dereference_protected() usage!
> ...
> [   39.033597] stack backtrace:
> [   39.033613] CPU: 0 UID: 0 PID: 174 Comm: python3 Not tainted
> 6.13.0-rc7-next-20250116-arm64-renesas-00002-g35245dfdc62c #7
> [   39.033623] Hardware name: Renesas SMARC EVK version 2 based on
> r9a08g045s33 (DT)
> [   39.033628] Call trace:
> [   39.033633]  show_stack+0x14/0x1c (C)
> [   39.033652]  dump_stack_lvl+0xb4/0xc4
> [   39.033664]  dump_stack+0x14/0x1c
> [   39.033671]  lockdep_rcu_suspicious+0x16c/0x22c
> [   39.033682]  phy_detach+0x160/0x190
> [   39.033694]  phy_disconnect+0x40/0x54
> [   39.033703]  ravb_close+0x6c/0x1cc
> [   39.033714]  ravb_suspend+0x48/0x120
> [   39.033721]  dpm_run_callback+0x4c/0x14c
> [   39.033731]  device_suspend+0x11c/0x4dc
> [   39.033740]  dpm_suspend+0xdc/0x214
> [   39.033748]  dpm_suspend_start+0x48/0x60
> [   39.033758]  suspend_devices_and_enter+0x124/0x574
> [   39.033769]  pm_suspend+0x1ac/0x274
> [   39.033778]  state_store+0x88/0x124
> [   39.033788]  kobj_attr_store+0x14/0x24
> [   39.033798]  sysfs_kf_write+0x48/0x6c
> [   39.033808]  kernfs_fop_write_iter+0x118/0x1a8
> [   39.033817]  vfs_write+0x27c/0x378
> [   39.033825]  ksys_write+0x64/0xf4
> [   39.033833]  __arm64_sys_write+0x18/0x20
> [   39.033841]  invoke_syscall+0x44/0x104
> [   39.033852]  el0_svc_common.constprop.0+0xb4/0xd4
> [   39.033862]  do_el0_svc+0x18/0x20
> [   39.033870]  el0_svc+0x3c/0xf0
> [   39.033880]  el0t_64_sync_handler+0xc0/0xc4
> [   39.033888]  el0t_64_sync+0x154/0x158
> [   39.041274] ravb 11c30000.ethernet eth0: Link is Down
> 
> Reported-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
> Closes: https://lore.kernel.org/netdev/4c6419d8-c06b-495c-b987-d66c2e1ff848@tuxon.dev/
> Fixes: 0184165b2f42 ("ravb: add sleep PM suspend/resume support")
> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>

I think we can simplify ravb_suspend() and ravb_resume() with a bit of
refactoring, but that can be done as a follow-up.

Reviewed-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Claudiu Beznea Jan. 23, 2025, 11:33 a.m. UTC | #5
Hi, Kory,

On 22.01.2025 18:19, Kory Maincent wrote:
> Fix the suspend path by ensuring the rtnl lock is held where required.
> Calls to ravb_open, ravb_close and wol operations must be performed under
> the rtnl lock to prevent conflicts with ongoing ndo operations.
> 
> Without this fix, the following warning is triggered:
> [   39.032969] =============================
> [   39.032983] WARNING: suspicious RCU usage
> [   39.033019] -----------------------------
> [   39.033033] drivers/net/phy/phy_device.c:2004 suspicious
> rcu_dereference_protected() usage!
> ...
> [   39.033597] stack backtrace:
> [   39.033613] CPU: 0 UID: 0 PID: 174 Comm: python3 Not tainted
> 6.13.0-rc7-next-20250116-arm64-renesas-00002-g35245dfdc62c #7
> [   39.033623] Hardware name: Renesas SMARC EVK version 2 based on
> r9a08g045s33 (DT)
> [   39.033628] Call trace:
> [   39.033633]  show_stack+0x14/0x1c (C)
> [   39.033652]  dump_stack_lvl+0xb4/0xc4
> [   39.033664]  dump_stack+0x14/0x1c
> [   39.033671]  lockdep_rcu_suspicious+0x16c/0x22c
> [   39.033682]  phy_detach+0x160/0x190
> [   39.033694]  phy_disconnect+0x40/0x54
> [   39.033703]  ravb_close+0x6c/0x1cc
> [   39.033714]  ravb_suspend+0x48/0x120
> [   39.033721]  dpm_run_callback+0x4c/0x14c
> [   39.033731]  device_suspend+0x11c/0x4dc
> [   39.033740]  dpm_suspend+0xdc/0x214
> [   39.033748]  dpm_suspend_start+0x48/0x60
> [   39.033758]  suspend_devices_and_enter+0x124/0x574
> [   39.033769]  pm_suspend+0x1ac/0x274
> [   39.033778]  state_store+0x88/0x124
> [   39.033788]  kobj_attr_store+0x14/0x24
> [   39.033798]  sysfs_kf_write+0x48/0x6c
> [   39.033808]  kernfs_fop_write_iter+0x118/0x1a8
> [   39.033817]  vfs_write+0x27c/0x378
> [   39.033825]  ksys_write+0x64/0xf4
> [   39.033833]  __arm64_sys_write+0x18/0x20
> [   39.033841]  invoke_syscall+0x44/0x104
> [   39.033852]  el0_svc_common.constprop.0+0xb4/0xd4
> [   39.033862]  do_el0_svc+0x18/0x20
> [   39.033870]  el0_svc+0x3c/0xf0
> [   39.033880]  el0t_64_sync_handler+0xc0/0xc4
> [   39.033888]  el0t_64_sync+0x154/0x158
> [   39.041274] ravb 11c30000.ethernet eth0: Link is Down
> 
> Reported-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
> Closes: https://lore.kernel.org/netdev/4c6419d8-c06b-495c-b987-d66c2e1ff848@tuxon.dev/
> Fixes: 0184165b2f42 ("ravb: add sleep PM suspend/resume support")
> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>

I've tested it. Looks good.

Thank you for your patch. However, I think this could be simplified. The
locking scheme looks complicated to me. E.g., this one works too:

diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index bc395294a32d..cfe4f0f364f3 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -3217,10 +3217,16 @@ static int ravb_suspend(struct device *dev)

        netif_device_detach(ndev);

-       if (priv->wol_enabled)
-               return ravb_wol_setup(ndev);
+       if (priv->wol_enabled) {
+               rtnl_lock();
+               ret = ravb_wol_setup(ndev);
+               rtnl_unlock();
+               return ret;
+       }

+       rtnl_lock();
        ret = ravb_close(ndev);
+       rtnl_unlock();
        if (ret)
                return ret;

@@ -3247,7 +3253,9 @@ static int ravb_resume(struct device *dev)

        /* If WoL is enabled restore the interface. */
        if (priv->wol_enabled) {
+               rtnl_lock();
                ret = ravb_wol_restore(ndev);
+               rtnl_unlock();
                if (ret)
                        return ret;
        } else {
@@ -3257,7 +3265,9 @@ static int ravb_resume(struct device *dev)
        }

        /* Reopening the interface will restore the device to the working state. */
+       rtnl_lock();
        ret = ravb_open(ndev);
+       rtnl_unlock();
        if (ret < 0)
                goto out_rpm_put;


> ---
>  drivers/net/ethernet/renesas/ravb_main.c | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
> index bc395294a32d..2c6d8e4966c3 100644
> --- a/drivers/net/ethernet/renesas/ravb_main.c
> +++ b/drivers/net/ethernet/renesas/ravb_main.c
> @@ -3217,10 +3217,15 @@ static int ravb_suspend(struct device *dev)
>  
>  	netif_device_detach(ndev);
>  
> -	if (priv->wol_enabled)
> -		return ravb_wol_setup(ndev);
> +	rtnl_lock();
> +	if (priv->wol_enabled) {
> +		ret = ravb_wol_setup(ndev);
> +		rtnl_unlock();
> +		return ret;
> +	}
>  
>  	ret = ravb_close(ndev);
> +	rtnl_unlock();
>  	if (ret)
>  		return ret;
>  
> @@ -3245,19 +3250,25 @@ static int ravb_resume(struct device *dev)
>  	if (!netif_running(ndev))
>  		return 0;
>  
> +	rtnl_lock();
>  	/* If WoL is enabled restore the interface. */
>  	if (priv->wol_enabled) {
>  		ret = ravb_wol_restore(ndev);
> -		if (ret)
> +		if (ret)  {
> +			rtnl_unlock();
>  			return ret;
> +		}
>  	} else {
>  		ret = pm_runtime_force_resume(dev);
> -		if (ret)
> +		if (ret) {
> +			rtnl_unlock();
>  			return ret;
> +		}
>  	}
>  
>  	/* Reopening the interface will restore the device to the working state. */
>  	ret = ravb_open(ndev);
> +	rtnl_unlock();
>  	if (ret < 0)
>  		goto out_rpm_put;
>  
>
Kory Maincent Jan. 23, 2025, 2:08 p.m. UTC | #6
On Thu, 23 Jan 2025 13:33:30 +0200
Claudiu Beznea <claudiu.beznea@tuxon.dev> wrote:

> Hi, Kory,
> 
> On 22.01.2025 18:19, Kory Maincent wrote:
> > Fix the suspend path by ensuring the rtnl lock is held where required.
> > Calls to ravb_open, ravb_close and wol operations must be performed under
> > the rtnl lock to prevent conflicts with ongoing ndo operations.

> 
> I've tested it. Looks good.
> 
> Thank you for your patch. However, I think this could be simplified. The
> locking scheme looks complicated to me. E.g., this one works too:
> 
> diff --git a/drivers/net/ethernet/renesas/ravb_main.c
> b/drivers/net/ethernet/renesas/ravb_main.c
> index bc395294a32d..cfe4f0f364f3 100644
> --- a/drivers/net/ethernet/renesas/ravb_main.c
> +++ b/drivers/net/ethernet/renesas/ravb_main.c
> @@ -3217,10 +3217,16 @@ static int ravb_suspend(struct device *dev)
> 
>         netif_device_detach(ndev);
> 
> -       if (priv->wol_enabled)
> -               return ravb_wol_setup(ndev);
> +       if (priv->wol_enabled) {
> +               rtnl_lock();
> +               ret = ravb_wol_setup(ndev);
> +               rtnl_unlock();
> +               return ret;
> +       }

What happens if the wol_enabled flag changes its state between the rtnl_lock
and the if condition? We would be in the wrong path.

Regards,
Claudiu Beznea Jan. 23, 2025, 2:17 p.m. UTC | #7
On 23.01.2025 16:08, Kory Maincent wrote:
> On Thu, 23 Jan 2025 13:33:30 +0200
> Claudiu Beznea <claudiu.beznea@tuxon.dev> wrote:
> 
>> Hi, Kory,
>>
>> On 22.01.2025 18:19, Kory Maincent wrote:
>>> Fix the suspend path by ensuring the rtnl lock is held where required.
>>> Calls to ravb_open, ravb_close and wol operations must be performed under
>>> the rtnl lock to prevent conflicts with ongoing ndo operations.
> 
>>
>> I've tested it. Looks good.
>>
>> Thank you for your patch. However, I think this could be simplified. The
>> locking scheme looks complicated to me. E.g., this one works too:
>>
>> diff --git a/drivers/net/ethernet/renesas/ravb_main.c
>> b/drivers/net/ethernet/renesas/ravb_main.c
>> index bc395294a32d..cfe4f0f364f3 100644
>> --- a/drivers/net/ethernet/renesas/ravb_main.c
>> +++ b/drivers/net/ethernet/renesas/ravb_main.c
>> @@ -3217,10 +3217,16 @@ static int ravb_suspend(struct device *dev)
>>
>>         netif_device_detach(ndev);
>>
>> -       if (priv->wol_enabled)
>> -               return ravb_wol_setup(ndev);
>> +       if (priv->wol_enabled) {
>> +               rtnl_lock();
>> +               ret = ravb_wol_setup(ndev);
>> +               rtnl_unlock();
>> +               return ret;
>> +       }
> 
> What happens if the wol_enabled flag changes its state between the rtnl_lock
> and the if condition? We would be in the wrong path.

The wol_enabled flag can't change in this suspend phase. The user space tasks
are frozen when ravb_suspend() is called.

Thank you,
Claudiu

> 
> Regards,
Kory Maincent Jan. 23, 2025, 2:38 p.m. UTC | #8
On Thu, 23 Jan 2025 16:17:58 +0200
Claudiu Beznea <claudiu.beznea@tuxon.dev> wrote:

> On 23.01.2025 16:08, Kory Maincent wrote:
> > On Thu, 23 Jan 2025 13:33:30 +0200
> > Claudiu Beznea <claudiu.beznea@tuxon.dev> wrote:
> >   
> >> Hi, Kory,
> >>
> >> On 22.01.2025 18:19, Kory Maincent wrote:  
>  [...]  
> >   
> >>
> >> I've tested it. Looks good.
> >>
> >> Thank you for your patch. However, I think this could be simplified. The
> >> locking scheme looks complicated to me. E.g., this one works too:
> >>
> >> diff --git a/drivers/net/ethernet/renesas/ravb_main.c
> >> b/drivers/net/ethernet/renesas/ravb_main.c
> >> index bc395294a32d..cfe4f0f364f3 100644
> >> --- a/drivers/net/ethernet/renesas/ravb_main.c
> >> +++ b/drivers/net/ethernet/renesas/ravb_main.c
> >> @@ -3217,10 +3217,16 @@ static int ravb_suspend(struct device *dev)
> >>
> >>         netif_device_detach(ndev);
> >>
> >> -       if (priv->wol_enabled)
> >> -               return ravb_wol_setup(ndev);
> >> +       if (priv->wol_enabled) {
> >> +               rtnl_lock();
> >> +               ret = ravb_wol_setup(ndev);
> >> +               rtnl_unlock();
> >> +               return ret;
> >> +       }  
> > 
> > What happens if the wol_enabled flag changes its state between the rtnl_lock
> > and the if condition? We would be in the wrong path.
> 
> The wol_enabled flag can't change in this suspend phase. The user space tasks
> are frozen when ravb_suspend() is called.

Oh ok, I don't know the suspend path, but if it can't conflict we can go for
your proposal.

Regards,
Patch

diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index bc395294a32d..2c6d8e4966c3 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -3217,10 +3217,15 @@  static int ravb_suspend(struct device *dev)
 
 	netif_device_detach(ndev);
 
-	if (priv->wol_enabled)
-		return ravb_wol_setup(ndev);
+	rtnl_lock();
+	if (priv->wol_enabled) {
+		ret = ravb_wol_setup(ndev);
+		rtnl_unlock();
+		return ret;
+	}
 
 	ret = ravb_close(ndev);
+	rtnl_unlock();
 	if (ret)
 		return ret;
 
@@ -3245,19 +3250,25 @@  static int ravb_resume(struct device *dev)
 	if (!netif_running(ndev))
 		return 0;
 
+	rtnl_lock();
 	/* If WoL is enabled restore the interface. */
 	if (priv->wol_enabled) {
 		ret = ravb_wol_restore(ndev);
-		if (ret)
+		if (ret)  {
+			rtnl_unlock();
 			return ret;
+		}
 	} else {
 		ret = pm_runtime_force_resume(dev);
-		if (ret)
+		if (ret) {
+			rtnl_unlock();
 			return ret;
+		}
 	}
 
 	/* Reopening the interface will restore the device to the working state. */
 	ret = ravb_open(ndev);
+	rtnl_unlock();
 	if (ret < 0)
 		goto out_rpm_put;
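
Comments #3 and #4 note that the duplicated error checks in ravb_resume()
could be folded together. The following is only a userspace sketch of that
control flow, not driver code and not a proposed patch. Every mock_* name is
hypothetical: a plain flag stands in for the rtnl lock, and the stubs stand in
for ravb_wol_restore(), pm_runtime_force_resume() and ravb_open(). It keeps
the lock scope of the posted patch (so the force-resume stub still runs under
the lock, which comment #2 questions) and leaves out the out_rpm_put handling
that follows ravb_open() in the real function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the rtnl lock: a flag, not a real mutex. */
static int mock_rtnl_held;

static void mock_rtnl_lock(void)
{
	assert(!mock_rtnl_held);	/* catch double locking */
	mock_rtnl_held = 1;
}

static void mock_rtnl_unlock(void)
{
	assert(mock_rtnl_held);		/* catch unbalanced unlock */
	mock_rtnl_held = 0;
}

/* Hypothetical device state; the *_ret fields inject return codes. */
struct mock_priv {
	bool wol_enabled;
	int wol_restore_ret;
	int force_resume_ret;
	int open_ret;
};

/* Stub for the WoL restore step; must be called with the lock held. */
static int mock_wol_restore(struct mock_priv *p)
{
	assert(mock_rtnl_held);
	return p->wol_restore_ret;
}

/* Stub for the runtime-PM resume step; no locking requirement modelled. */
static int mock_force_resume(struct mock_priv *p)
{
	return p->force_resume_ret;
}

/* Stub for reopening the interface; must be called with the lock held. */
static int mock_open(struct mock_priv *p)
{
	assert(mock_rtnl_held);
	return p->open_ret;
}

/* One error check for both branches and a single unlock site. */
static int mock_resume(struct mock_priv *p)
{
	int ret;

	mock_rtnl_lock();
	if (p->wol_enabled)
		ret = mock_wol_restore(p);
	else
		ret = mock_force_resume(p);
	if (ret)
		goto out_unlock;

	ret = mock_open(p);
out_unlock:
	mock_rtnl_unlock();
	return ret;
}
```

With a single exit label, a new early return cannot forget the unlock, which
is the mistake the three separate rtnl_unlock() calls in the posted resume
hunk have to avoid by hand.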