
[RFC,net] net: ipconfig: Release the rtnl_lock while waiting for carrier

Message ID 20211027131953.9270-1-maxime.chevallier@bootlin.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series [RFC,net] net: ipconfig: Release the rtnl_lock while waiting for carrier

Checks

Context                         Check    Description
netdev/cover_letter             success  Single patches do not need cover letters
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/patch_count              success  Link
netdev/tree_selection           success  Clearly marked for net
netdev/subject_prefix           success  Link
netdev/cc_maintainers           success  CCed 5 of 5 maintainers
netdev/source_inline            success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/module_param             success  Was 0 now: 0
netdev/build_32bit              success  Errors and warnings before: 1 this patch: 1
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/verify_fixes             success  Fixes tag looks correct
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 12 lines checked
netdev/build_allmodconfig_warn  success  Errors and warnings before: 1 this patch: 1
netdev/header_inline            success  No static functions without inline keyword in header files

Commit Message

Maxime Chevallier Oct. 27, 2021, 1:19 p.m. UTC
While waiting for a carrier to come up on one of the netdevices, some
devices need to take the rtnl lock at some point to fully initialize
all parts of the link.

That's the case for SFP, where the rtnl is taken when a module gets
detected. This prevents mounting an NFS rootfs over an SFP link.

This means that while ipconfig waits for carriers to be detected, no SFP
module can be detected in the meantime; modules are only detected after
ipconfig times out.

This commit releases the rtnl_lock while waiting for the carrier to come
up, and re-takes it to check for the init device and carrier status.

At that point, the rtnl_lock seems to be only protecting
ic_is_init_dev().

Fixes: 73970055450e ("sfp: add SFP module support")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
---
I've sent this patch as an RFC (admittedly, it doesn't look very clean), since
I'm not fully familiar with the implications of modifying the locking scheme at
that point in the boot process. Please feel free to comment or suggest other
approaches.

 net/ipv4/ipconfig.c | 5 +++++
 1 file changed, 5 insertions(+)
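
The locking problem described in the commit message can be illustrated with a
minimal userspace sketch. This is not kernel code: it assumes a pthread mutex
(fake_rtnl) standing in for the rtnl lock and a thread (fake_sfp_probe)
standing in for SFP module detection. Those names, wait_for_carrier() and its
parameters are all hypothetical and exist only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Userspace stand-ins (hypothetical names), not kernel APIs. */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool carrier_up;

/* Plays the role of SFP module detection: it can only bring the
 * link up once it has taken the lock.
 */
static void *fake_sfp_probe(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&fake_rtnl);
	atomic_store(&carrier_up, true);
	pthread_mutex_unlock(&fake_rtnl);
	return NULL;
}

static void sleep_1ms(void)
{
	struct timespec ts = { 0, 1000000L };

	nanosleep(&ts, NULL);
}

/* Poll for carrier at most max_polls times while holding the lock.
 * If release_while_sleeping is true, the lock is dropped around the
 * sleep, as in the patch. Returns true if carrier was seen in time.
 */
static bool wait_for_carrier(bool release_while_sleeping, int max_polls)
{
	pthread_t probe;
	bool found = false;
	int rc;

	atomic_store(&carrier_up, false);
	pthread_mutex_lock(&fake_rtnl);

	/* The probe starts while the lock is already held. */
	rc = pthread_create(&probe, NULL, fake_sfp_probe, NULL);
	assert(rc == 0);
	(void)rc;

	for (int i = 0; i < max_polls; i++) {
		if (atomic_load(&carrier_up)) {
			found = true;
			break;
		}
		if (release_while_sleeping)
			pthread_mutex_unlock(&fake_rtnl);
		sleep_1ms();
		if (release_while_sleeping)
			pthread_mutex_lock(&fake_rtnl);
	}

	pthread_mutex_unlock(&fake_rtnl);
	pthread_join(probe, NULL);
	return found;
}
```

Calling wait_for_carrier(false, 20) models the current behaviour: the probe
stays blocked for the whole wait and the carrier only comes up after the
timeout. Calling wait_for_carrier(true, 500) models the patched loop, where
the probe gets a chance to run during each sleep.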

Comments

Antoine Tenart Oct. 27, 2021, 4:05 p.m. UTC | #1
Hi Maxime,

Quoting Maxime Chevallier (2021-10-27 15:19:53)
> While waiting for a carrier to come up on one of the netdevices, some
> devices need to take the rtnl lock at some point to fully initialize
> all parts of the link.
> 
> That's the case for SFP, where the rtnl is taken when a module gets
> detected. This prevents mounting an NFS rootfs over an SFP link.
> 
> This means that while ipconfig waits for carriers to be detected, no SFP
> module can be detected in the meantime; modules are only detected after
> ipconfig times out.
> 
> This commit releases the rtnl_lock while waiting for the carrier to come
> up, and re-takes it to check for the init device and carrier status.
> 
> At that point, the rtnl_lock seems to be only protecting
> ic_is_init_dev().
> 
> Fixes: 73970055450e ("sfp: add SFP module support")

Was this working with SFP modules before?

> diff --git a/net/ipv4/ipconfig.c b/net/ipv4/ipconfig.c
> index 816d8aad5a68..069ae05bd0a5 100644
> --- a/net/ipv4/ipconfig.c
> +++ b/net/ipv4/ipconfig.c
> @@ -278,7 +278,12 @@ static int __init ic_open_devs(void)
>                         if (ic_is_init_dev(dev) && netif_carrier_ok(dev))
>                                 goto have_carrier;
>  
> +               /* Give a chance to do complex initialization that
> +                * would require to take the rtnl lock.
> +                */
> +               rtnl_unlock();
>                 msleep(1);
> +               rtnl_lock();
>  
>                 if (time_before(jiffies, next_msg))
>                         continue;

The rtnl lock is protecting 'for_each_netdev' and 'dev_change_flags' in
this function. What could happen in theory is a device gets removed from
the list or has its flags changed. I don't think that's an issue here.

Instead of releasing the lock while sleeping, you could drop the lock
before the carrier waiting loop (with a similar comment) and only
protect the above 'for_each_netdev' loop.

Antoine
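
The alternative suggested in this reply (hold the lock only around the device
walk, then wait for the carrier without it) can be sketched in userspace as
follows. This is an illustration under stated assumptions, not the kernel
implementation: fake_rtnl, fake_sfp_probe, opened_devs and open_then_wait()
are hypothetical names, and the device walk is reduced to a counter.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Userspace stand-ins (hypothetical names), not kernel APIs. */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool carrier_up;
static int opened_devs;

/* Plays the role of SFP module detection, which needs the lock. */
static void *fake_sfp_probe(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&fake_rtnl);
	atomic_store(&carrier_up, true);
	pthread_mutex_unlock(&fake_rtnl);
	return NULL;
}

static void sleep_1ms(void)
{
	struct timespec ts = { 0, 1000000L };

	nanosleep(&ts, NULL);
}

/* Phase 1 walks the (fake) device list under the lock.
 * Phase 2 polls for carrier with the lock already released.
 * Returns true if carrier was seen within max_polls polls.
 */
static bool open_then_wait(int ndevs, int max_polls)
{
	pthread_t probe;
	bool found = false;
	int rc;

	atomic_store(&carrier_up, false);

	/* Phase 1: only the device walk is protected. */
	pthread_mutex_lock(&fake_rtnl);
	opened_devs = 0;
	for (int i = 0; i < ndevs; i++)
		opened_devs++;		/* stands in for opening a device */
	rc = pthread_create(&probe, NULL, fake_sfp_probe, NULL);
	assert(rc == 0);
	(void)rc;
	pthread_mutex_unlock(&fake_rtnl);

	/* Phase 2: the probe can take the lock at any time now. */
	for (int i = 0; i < max_polls; i++) {
		if (atomic_load(&carrier_up)) {
			found = true;
			break;
		}
		sleep_1ms();
	}

	pthread_join(probe, NULL);
	return found;
}
```

For example, open_then_wait(2, 500) walks two fake devices under the lock and
then waits unlocked. Compared with unlocking around each sleep, the locked
region is a single block, which makes it easier to see what the lock protects.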
Maxime Chevallier Oct. 28, 2021, 6:45 a.m. UTC | #2
Hello Antoine,

On Wed, 27 Oct 2021 18:05:09 +0200
Antoine Tenart <atenart@kernel.org> wrote:

>Hi Maxime,
>
>Quoting Maxime Chevallier (2021-10-27 15:19:53)
>> While waiting for a carrier to come up on one of the netdevices, some
>> devices need to take the rtnl lock at some point to fully initialize
>> all parts of the link.
>> 
>> That's the case for SFP, where the rtnl is taken when a module gets
>> detected. This prevents mounting an NFS rootfs over an SFP link.
>> 
>> This means that while ipconfig waits for carriers to be detected, no SFP
>> module can be detected in the meantime; modules are only detected after
>> ipconfig times out.
>> 
>> This commit releases the rtnl_lock while waiting for the carrier to come
>> up, and re-takes it to check for the init device and carrier status.
>> 
>> At that point, the rtnl_lock seems to be only protecting
>> ic_is_init_dev().
>> 
>> Fixes: 73970055450e ("sfp: add SFP module support")  
>
>Was this working with SFP modules before?

From what I can tell, no. In that case, does it need a Fixes tag?
It seems the problem has always been there, and booting an nfsroot
never worked over SFP links.

>
>> diff --git a/net/ipv4/ipconfig.c b/net/ipv4/ipconfig.c
>> index 816d8aad5a68..069ae05bd0a5 100644
>> --- a/net/ipv4/ipconfig.c
>> +++ b/net/ipv4/ipconfig.c
>> @@ -278,7 +278,12 @@ static int __init ic_open_devs(void)
>>                         if (ic_is_init_dev(dev) && netif_carrier_ok(dev))
>>                                 goto have_carrier;
>>  
>> +               /* Give a chance to do complex initialization that
>> +                * would require to take the rtnl lock.
>> +                */
>> +               rtnl_unlock();
>>                 msleep(1);
>> +               rtnl_lock();
>>  
>>                 if (time_before(jiffies, next_msg))
>>                         continue;  
>
>The rtnl lock is protecting 'for_each_netdev' and 'dev_change_flags' in
>this function. What could happen in theory is a device gets removed from
>the list or has its flags changed. I don't think that's an issue here.
>
>Instead of releasing the lock while sleeping, you could drop the lock
>before the carrier waiting loop (with a similar comment) and only
>protect the above 'for_each_netdev' loop.

Nice catch, the effect should be the same but with a much cleaner idea
of what is being protected.

I'll give it a try and respin, thanks for the review!

Maxime

>Antoine
Antoine Tenart Oct. 28, 2021, 8:41 a.m. UTC | #3
Quoting Maxime Chevallier (2021-10-28 08:45:20)
> On Wed, 27 Oct 2021 18:05:09 +0200
> Antoine Tenart <atenart@kernel.org> wrote:
> >Quoting Maxime Chevallier (2021-10-27 15:19:53)
> >> 
> >> Fixes: 73970055450e ("sfp: add SFP module support")  
> >
> >Was this working with SFP modules before?
> 
> From what I can tell, no. In that case, does it need a Fixes tag?
> It seems the problem has always been there, and booting an nfsroot
> never worked over SFP links.

In that case I'd say targeting net-next is fine.

Antoine

Patch

diff --git a/net/ipv4/ipconfig.c b/net/ipv4/ipconfig.c
index 816d8aad5a68..069ae05bd0a5 100644
--- a/net/ipv4/ipconfig.c
+++ b/net/ipv4/ipconfig.c
@@ -278,7 +278,12 @@  static int __init ic_open_devs(void)
 			if (ic_is_init_dev(dev) && netif_carrier_ok(dev))
 				goto have_carrier;
 
+		/* Give a chance to do complex initialization that
+		 * would require to take the rtnl lock.
+		 */
+		rtnl_unlock();
 		msleep(1);
+		rtnl_lock();
 
 		if (time_before(jiffies, next_msg))
 			continue;