diff mbox series

SUNRPC: Don't disable preemption while calling svc_pool_for_cpu().

Message ID YnK2ujabd2+oCrT/@linutronix.de (mailing list archive)
State New
Headers show
Series SUNRPC: Don't disable preemption while calling svc_pool_for_cpu(). | expand

Commit Message

Sebastian Andrzej Siewior May 4, 2022, 5:24 p.m. UTC
svc_xprt_enqueue() disables preemption via get_cpu() and then asks for a
pool of a specific CPU (current) via svc_pool_for_cpu().
With disabled preemption it acquires svc_pool::sp_lock, a spinlock_t,
which is a sleeping lock on PREEMPT_RT and can't be acquired with
disabled preemption.

Disabling preemption is not required here. The pool is protected with a
lock so the following list access is safe even cross-CPU. The following
iteration through svc_pool::sp_all_threads is under RCU-readlock and
remaining operations within the loop are atomic and do not rely on
disabled-preemption.

Use raw_smp_processor_id() as the argument for the requested CPU in
svc_pool_for_cpu().

Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 net/sunrpc/svc_xprt.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

Comments

Chuck Lever III May 5, 2022, 3:19 a.m. UTC | #1
On May 4, 2022, at 10:24 AM, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> 
> svc_xprt_enqueue() disables preemption via get_cpu() and then asks for a
> pool of a specific CPU (current) via svc_pool_for_cpu().
> With disabled preemption it acquires svc_pool::sp_lock, a spinlock_t,
> which is a sleeping lock on PREEMPT_RT and can't be acquired with
> disabled preemption.
> 
> Disabling preemption is not required here. The pool is protected with a
> lock so the following list access is safe even cross-CPU. The following
> iteration through svc_pool::sp_all_threads is under RCU-readlock and
> remaining operations within the loop are atomic and do not rely on
> disabled-preemption.
> 
> Use raw_smp_processor_id() as the argument for the requested CPU in
> svc_pool_for_cpu().
> 
> Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Sebastian, I will have a closer look in a few days. Meanwhile, if I’m reading the patch description right, this is a bug fix. Was there a lockdep splat (ie, how did you find this issue)? Does it belong in 5.18-rc? Should it have a Fixes: tag?


> ---
> net/sunrpc/svc_xprt.c | 5 +----
> 1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index 5b59e2103526e..79965deec5b12 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -448,7 +448,6 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
> {
>    struct svc_pool *pool;
>    struct svc_rqst    *rqstp = NULL;
> -    int cpu;
> 
>    if (!svc_xprt_ready(xprt))
>        return;
> @@ -461,8 +460,7 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
>    if (test_and_set_bit(XPT_BUSY, &xprt->xpt_flags))
>        return;
> 
> -    cpu = get_cpu();
> -    pool = svc_pool_for_cpu(xprt->xpt_server, cpu);
> +    pool = svc_pool_for_cpu(xprt->xpt_server, raw_smp_processor_id());
> 
>    atomic_long_inc(&pool->sp_stats.packets);
> 
> @@ -485,7 +483,6 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
>    rqstp = NULL;
> out_unlock:
>    rcu_read_unlock();
> -    put_cpu();
>    trace_svc_xprt_enqueue(xprt, rqstp);
> }
> EXPORT_SYMBOL_GPL(svc_xprt_enqueue);
> -- 
> 2.36.0
>
Sebastian Andrzej Siewior May 5, 2022, 6:26 a.m. UTC | #2
On 2022-05-05 03:19:57 [+0000], Chuck Lever III wrote:
> 
> Sebastian, I will have a closer look in a few days. Meanwhile, if I’m
> reading the patch description right, this is a bug fix. Was there a
> lockdep splat (ie, how did you find this issue)? Does it belong in
> 5.18-rc? Should it have a Fixes: tag?

It is a bugfix for PREEMPT_RT and I'm posting this as part of getting
PREEMPT_RT merged upstream. This is already fixed in the PREEMPT_RT
queue. 
There is no need to get this merged in the current -rc series since
!PREEMPT_RT is not affected. Next release is fine.

Sebastian
Chuck Lever III May 9, 2022, 4:56 p.m. UTC | #3
Hi Sebastian-


> On May 4, 2022, at 1:24 PM, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> 
> svc_xprt_enqueue() disables preemption via get_cpu() and then asks for a
> pool of a specific CPU (current) via svc_pool_for_cpu().
> With disabled preemption it acquires svc_pool::sp_lock, a spinlock_t,
> which is a sleeping lock on PREEMPT_RT and can't be acquired with
> disabled preemption.

I found this paragraph a little unclear. If you repost, I'd suggest:

    svc_xprt_enqueue() disables preemption via get_cpu() and then asks
    for a pool of a specific CPU (current) via svc_pool_for_cpu().
    While preemption is disabled, svc_xprt_enqueue() acquires
    svc_pool::sp_lock with bottom-halfs disabled, which can sleep on
    PREEMPT_RT.


> Disabling preemption is not required here. The pool is protected with a
> lock so the following list access is safe even cross-CPU. The following
> iteration through svc_pool::sp_all_threads is under RCU-readlock and
> remaining operations within the loop are atomic and do not rely on
> disabled-preemption.
> 
> Use raw_smp_processor_id() as the argument for the requested CPU in
> svc_pool_for_cpu().
> 
> Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
> net/sunrpc/svc_xprt.c | 5 +----
> 1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index 5b59e2103526e..79965deec5b12 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -448,7 +448,6 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
> {
> 	struct svc_pool *pool;
> 	struct svc_rqst	*rqstp = NULL;
> -	int cpu;
> 
> 	if (!svc_xprt_ready(xprt))
> 		return;
> @@ -461,8 +460,7 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
> 	if (test_and_set_bit(XPT_BUSY, &xprt->xpt_flags))
> 		return;
> 
> -	cpu = get_cpu();
> -	pool = svc_pool_for_cpu(xprt->xpt_server, cpu);
> +	pool = svc_pool_for_cpu(xprt->xpt_server, raw_smp_processor_id());

The pre-2014 code here was this:

	cpu = get_cpu();
	pool = svc_pool_for_cpu(xprt->xpt_server, cpu);
	put_cpu();

Your explanation covers the rationale for leaving pre-emption
enabled in the body of svc_xprt_enqueue(), but it's not clear
to me that rationale also applies to svc_pool_for_cpu().

Protecting the svc_pool data structures with RCU might be
better, but would amount to a more extensive change. To address
the pre-emption issue quickly, you could simply revert this
call site back to its pre-2014 form and postpone deeper changes.

I have an NFS server in my lab on NUMA hardware and can run
some tests this week with this patch.


> 	atomic_long_inc(&pool->sp_stats.packets);
> 
> @@ -485,7 +483,6 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
> 	rqstp = NULL;
> out_unlock:
> 	rcu_read_unlock();
> -	put_cpu();
> 	trace_svc_xprt_enqueue(xprt, rqstp);
> }
> EXPORT_SYMBOL_GPL(svc_xprt_enqueue);
> -- 
> 2.36.0
> 

--
Chuck Lever
Sebastian Andrzej Siewior May 10, 2022, 12:02 p.m. UTC | #4
On 2022-05-09 16:56:47 [+0000], Chuck Lever III wrote:
> Hi Sebastian-
Hi Chuck,

> > On May 4, 2022, at 1:24 PM, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> > 
> > svc_xprt_enqueue() disables preemption via get_cpu() and then asks for a
> > pool of a specific CPU (current) via svc_pool_for_cpu().
> > With disabled preemption it acquires svc_pool::sp_lock, a spinlock_t,
> > which is a sleeping lock on PREEMPT_RT and can't be acquired with
> > disabled preemption.
> 
> I found this paragraph a little unclear. If you repost, I'd suggest:
> 
>     svc_xprt_enqueue() disables preemption via get_cpu() and then asks
>     for a pool of a specific CPU (current) via svc_pool_for_cpu().
>     While preemption is disabled, svc_xprt_enqueue() acquires
>     svc_pool::sp_lock with bottom-halfs disabled, which can sleep on
>     PREEMPT_RT.

Sure.

> > diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> > index 5b59e2103526e..79965deec5b12 100644
> > --- a/net/sunrpc/svc_xprt.c
> > +++ b/net/sunrpc/svc_xprt.c
> > @@ -461,8 +460,7 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
> > 	if (test_and_set_bit(XPT_BUSY, &xprt->xpt_flags))
> > 		return;
> > 
> > -	cpu = get_cpu();
> > -	pool = svc_pool_for_cpu(xprt->xpt_server, cpu);
> > +	pool = svc_pool_for_cpu(xprt->xpt_server, raw_smp_processor_id());
> 
> The pre-2014 code here was this:
> 
> 	cpu = get_cpu();
> 	pool = svc_pool_for_cpu(xprt->xpt_server, cpu);
> 	put_cpu();
> 
> Your explanation covers the rationale for leaving pre-emption
> enabled in the body of svc_xprt_enqueue(), but it's not clear
> to me that rationale also applies to svc_pool_for_cpu().

I don't see why svc_pool_for_cpu() should be protected by disabling
preemption. It returns a "struct svc_pool" depending on the CPU number.
In the SVC_POOL_PERNODE case it will return the same pointer/ struct for
two different CPUs which belong to the same node.

> Protecting the svc_pool data structures with RCU might be
> better, but would amount to a more extensive change. To address
> the pre-emption issue quickly, you could simply revert this
> call site back to its pre-2014 form and postpone deeper changes.

You mean before commit
   983c684466e02 ("SUNRPC: get rid of the request wait queue")

I'm not sure how RCU would help here. It would ensure that the pool does
not vanish within the RCU section. The pool remains around until
svc_destroy() and I assume that everything pool related (like
svc_pool::sp_sockets) is gone by then.

> I have an NFS server in my lab on NUMA hardware and can run
> some tests this week with this patch.

I'm a little confused now. I could repost the patch with an updated
description as you suggested _or_ limit the get_cpu()/preempt-disabled
section to be only around svc_pool_for_cpu(). I don't see the reason for
it but will do as requested (if you want me to) since it doesn't cause
any problems.

Sebastian
Chuck Lever III May 10, 2022, 1:38 p.m. UTC | #5
> On May 10, 2022, at 8:02 AM, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> 
> On 2022-05-09 16:56:47 [+0000], Chuck Lever III wrote:
>> Hi Sebastian-
> Hi Chuck,
> 
>>> On May 4, 2022, at 1:24 PM, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
>>> 
>>> svc_xprt_enqueue() disables preemption via get_cpu() and then asks for a
>>> pool of a specific CPU (current) via svc_pool_for_cpu().
>>> With disabled preemption it acquires svc_pool::sp_lock, a spinlock_t,
>>> which is a sleeping lock on PREEMPT_RT and can't be acquired with
>>> disabled preemption.
>> 
>> I found this paragraph a little unclear. If you repost, I'd suggest:
>> 
>>    svc_xprt_enqueue() disables preemption via get_cpu() and then asks
>>    for a pool of a specific CPU (current) via svc_pool_for_cpu().
>>    While preemption is disabled, svc_xprt_enqueue() acquires
>>    svc_pool::sp_lock with bottom-halfs disabled, which can sleep on
>>    PREEMPT_RT.
> 
> Sure.
> 
>>> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
>>> index 5b59e2103526e..79965deec5b12 100644
>>> --- a/net/sunrpc/svc_xprt.c
>>> +++ b/net/sunrpc/svc_xprt.c
>>> @@ -461,8 +460,7 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
>>> 	if (test_and_set_bit(XPT_BUSY, &xprt->xpt_flags))
>>> 		return;
>>> 
>>> -	cpu = get_cpu();
>>> -	pool = svc_pool_for_cpu(xprt->xpt_server, cpu);
>>> +	pool = svc_pool_for_cpu(xprt->xpt_server, raw_smp_processor_id());
>> 
>> The pre-2014 code here was this:
>> 
>> 	cpu = get_cpu();
>> 	pool = svc_pool_for_cpu(xprt->xpt_server, cpu);
>> 	put_cpu();
>> 
>> Your explanation covers the rationale for leaving pre-emption
>> enabled in the body of svc_xprt_enqueue(), but it's not clear
>> to me that rationale also applies to svc_pool_for_cpu().
> 
> I don't see why svc_pool_for_cpu() should be protected by disabling
> preemption. It returns a "struct svc_pool" depending on the CPU number.
> In the SVC_POOL_PERNODE case it will return the same pointer/ struct for
> two different CPUs which belong to the same node.

Basically, why would bfd241600a3b ("[PATCH] knfsd: make rpc
threads pools numa aware") use get_cpu/put_cpu when
raw_smp_processor_id() was available to it? Looking through
the commit log, I can't see that it is necessary, but I
needed to convince myself that it was just a coding whim
and not done purposely to protect that function.

And, note that svc_pool_map is a read-write data structure.
I needed a little more review to convince myself that the
data structure cannot change once it has been initialized.

Third, my testing so far shows your patch does not introduce
any regression.

So I'm feeling more confident. Let's do this:

- Get rid of the @cpu argument to svc_pool_for_cpu() and
  do the raw_smp_processor_id() call inside that function.
  Please add a kerneldoc comment for svc_pool_for_cpu()
  that states the current CPU is an implicit argument.

- Fix up the patch description as above.

- Post a v2


>> Protecting the svc_pool data structures with RCU might be
>> better, but would amount to a more extensive change. To address
>> the pre-emption issue quickly, you could simply revert this
>> call site back to its pre-2014 form and postpone deeper changes.
> 
> You mean before commit
>   983c684466e02 ("SUNRPC: get rid of the request wait queue")
> 
> I'm not sure how RCU would help here. It would ensure that the pool does
> not vanish within the RCU section. The pool remains around until
> svc_destroy() and I assume that everything pool related (like
> svc_pool::sp_sockets) is gone by then.
> 
>> I have an NFS server in my lab on NUMA hardware and can run
>> some tests this week with this patch.
> 
> I'm a little confused now. I could repost the patch with an updated
> description as you suggested _or_ limit the get_cpu()/preempt-disabled
> section to be only around svc_pool_for_cpu(). I don't see the reason for
> it but will do as requested (if you want me to) since it doesn't cause
> any problems.

I think I'm convinced that using raw_smp_processor_id() is
safe to do, and it is certainly preferable in a performance
path to not toggle pre-emption at all.


--
Chuck Lever
diff mbox series

Patch

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 5b59e2103526e..79965deec5b12 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -448,7 +448,6 @@  void svc_xprt_enqueue(struct svc_xprt *xprt)
 {
 	struct svc_pool *pool;
 	struct svc_rqst	*rqstp = NULL;
-	int cpu;
 
 	if (!svc_xprt_ready(xprt))
 		return;
@@ -461,8 +460,7 @@  void svc_xprt_enqueue(struct svc_xprt *xprt)
 	if (test_and_set_bit(XPT_BUSY, &xprt->xpt_flags))
 		return;
 
-	cpu = get_cpu();
-	pool = svc_pool_for_cpu(xprt->xpt_server, cpu);
+	pool = svc_pool_for_cpu(xprt->xpt_server, raw_smp_processor_id());
 
 	atomic_long_inc(&pool->sp_stats.packets);
 
@@ -485,7 +483,6 @@  void svc_xprt_enqueue(struct svc_xprt *xprt)
 	rqstp = NULL;
 out_unlock:
 	rcu_read_unlock();
-	put_cpu();
 	trace_svc_xprt_enqueue(xprt, rqstp);
 }
 EXPORT_SYMBOL_GPL(svc_xprt_enqueue);