[v3,resend,0/6] Implement call_rcu_lazy() and miscellaneous fixes

Message ID 20220809034517.3867176-1-joel@joelfernandes.org (mailing list archive)

Message

Joel Fernandes Aug. 9, 2022, 3:45 a.m. UTC
Just a refresh of v3 with one additional debug patch. v3's cover letter is here:
 https://lore.kernel.org/all/20220713213237.1596225-1-joel@joelfernandes.org/

I just started working on this again while I have some time during paternity
leave ;-) So I thought I'll just send it out again. No other changes other
than that 1 debug patch I added on the top.

Next I am going to go refine the power results as mentioned in Paul's comments
on the last cover letter.

Joel Fernandes (Google) (5):
  rcu: Introduce call_rcu_lazy() API implementation
  rcuscale: Add laziness and kfree tests
  fs: Move call_rcu() to call_rcu_lazy() in some paths
  rcutorture: Add test code for call_rcu_lazy()
  debug: Toggle lazy at runtime and change flush jiffies

Vineeth Pillai (1):
  rcu: shrinker for lazy rcu

fs/dcache.c                                   |   4 +-
fs/eventpoll.c                                |   2 +-
fs/file_table.c                               |   2 +-
fs/inode.c                                    |   2 +-
include/linux/rcu_segcblist.h                 |   1 +
include/linux/rcupdate.h                      |   6 +
include/linux/sched/sysctl.h                  |   3 +
kernel/rcu/Kconfig                            |   8 +
kernel/rcu/rcu.h                              |  12 +
kernel/rcu/rcu_segcblist.c                    |  15 +-
kernel/rcu/rcu_segcblist.h                    |  20 +-
kernel/rcu/rcuscale.c                         |  74 +++++-
kernel/rcu/rcutorture.c                       |  60 ++++-
kernel/rcu/tree.c                             | 131 ++++++----
kernel/rcu/tree.h                             |  10 +-
kernel/rcu/tree_nocb.h                        | 246 +++++++++++++++---
kernel/sysctl.c                               |  17 ++
.../selftests/rcutorture/configs/rcu/CFLIST   |   1 +
.../selftests/rcutorture/configs/rcu/TREE11   |  18 ++
.../rcutorture/configs/rcu/TREE11.boot        |   8 +
20 files changed, 536 insertions(+), 104 deletions(-)
create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/TREE11
create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/TREE11.boot

--
2.37.1.559.g78731f0fdb-goog

Comments

Joel Fernandes Aug. 11, 2022, 2:23 a.m. UTC | #1
On 8/8/2022 11:45 PM, Joel Fernandes (Google) wrote:
> Just a refresh of v3 with one additional debug patch. v3's cover letter is here:
>  https://lore.kernel.org/all/20220713213237.1596225-1-joel@joelfernandes.org/
> 
> I just started working on this again while I have some time during paternity
> leave ;-) So I thought I'll just send it out again. No other changes other
> than that 1 debug patch I added on the top.
> 
> Next I am going to go refine the power results as mentioned in Paul's comments
> on the last cover letter.

Side note: Here is another big selling point for call_rcu_lazy().
Instead of _lazy(), if you just increased jiffies_till_first_fqs, and
slowed *all* call_rcu() down to achieve the same effect, that would
affect percpu refcounters switching to atomic-mode, for example.

They switch to atomic mode by calling __percpu_ref_switch_mode() which
is called by percpu_ref_switch_to_atomic_sync().

This will slow this call down for the full lazy duration which will slow
down suspend in blk_pre_runtime_suspend().

This is why we cannot assume that call_rcu() users mostly just want to
free memory. There could be other cases just like this one, and a blanket
slowdown of call_rcu() might bite at unexpected times.
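
To make the contrast above concrete, here is a minimal userspace sketch.
It is a toy model, not kernel code: the struct, the function name, and the
fixed grace-period length are all hypothetical, and "slow_all" only stands
in for the effect of slowing every call_rcu() down globally.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: times are in jiffies and every name here is hypothetical. */
struct toy_policy {
	bool slow_all;            /* true: defer every callback (global slowdown) */
	unsigned long lazy_delay; /* how long a deferred callback waits first */
	unsigned long gp_len;     /* assumed fixed grace-period length */
};

/* Jiffy at which a callback queued at 'queued_at' becomes invocable. */
static unsigned long toy_cb_ready_at(const struct toy_policy *p,
				     unsigned long queued_at, bool lazy)
{
	assert(p);

	/* Under a global slowdown, the caller's urgency is ignored. */
	bool deferred = p->slow_all || lazy;

	return queued_at + (deferred ? p->lazy_delay : 0) + p->gp_len;
}
```

With lazy_delay = 10000 and gp_len = 30, a non-lazy callback queued at
jiffy 100 is ready at 130 under the selective policy but only at 10130
under the global one. That second number is the kind of wait a percpu-ref
switch to atomic mode would be exposed to in this model.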

I am going to add this as a selling point for selective lazyfication
(hey I get to invent words while I'm inventing new features), to my
cover letter and slides.

 - Joel



Joel Fernandes Aug. 11, 2022, 2:31 a.m. UTC | #2
On 8/10/2022 10:23 PM, Joel Fernandes wrote:
> 
> 
> On 8/8/2022 11:45 PM, Joel Fernandes (Google) wrote:
>> Just a refresh of v3 with one additional debug patch. v3's cover letter is here:
>>  https://lore.kernel.org/all/20220713213237.1596225-1-joel@joelfernandes.org/
>>
>> I just started working on this again while I have some time during paternity
>> leave ;-) So I thought I'll just send it out again. No other changes other
>> than that 1 debug patch I added on the top.
>>
>> Next I am going to go refine the power results as mentioned in Paul's comments
>> on the last cover letter.
> 
> Side note: Here is another big selling point for call_rcu_lazy().
> Instead of _lazy(), if you just increased jiffies_till_first_fqs, and
> slowed *all* call_rcu() down to achieve the same effect, that would
> affect percpu refcounters switching to atomic-mode, for example.
> 
> They switch to atomic mode by calling __percpu_ref_switch_mode() which
> is called by percpu_ref_switch_to_atomic_sync().
> 
> This will slow this call down for the full lazy duration which will slow
> down suspend in blk_pre_runtime_suspend().

Correction while I am going on the record (got to be careful these
days). It *might* slow down RCU for the full lazy duration, unless of
course a fly-by rescue call_rcu() comes in.
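
A rough userspace sketch of that "rescue" effect, under the assumption
that queuing a non-lazy callback flushes whatever lazy callbacks are
parked on the same queue. The struct, its fields, and toy_enqueue() are
hypothetical names for illustration and are not taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-CPU queue; all names are hypothetical, times are in jiffies. */
struct toy_queue {
	unsigned long n_lazy;    /* lazy callbacks currently parked */
	unsigned long flush_at;  /* jiffy at which parked callbacks are flushed */
	unsigned long max_delay; /* longest a lazy callback may stay parked */
};

/*
 * Queue one callback at time 'now' and return the jiffy at which it is
 * currently scheduled to be handed to the grace-period machinery.
 */
static unsigned long toy_enqueue(struct toy_queue *q, unsigned long now,
				 bool lazy)
{
	assert(q);

	if (!lazy) {
		/* A non-lazy callback flushes everything parked: the rescue. */
		q->n_lazy = 0;
		q->flush_at = now;
		return now;
	}

	/* The first parked callback arms the flush deadline. */
	if (q->n_lazy == 0)
		q->flush_at = now + q->max_delay;
	q->n_lazy++;
	return q->flush_at;
}
```

In this model, two lazy callbacks queued at jiffies 100 and 200 with
max_delay = 10000 both wait until 10100, unless a non-lazy callback
arrives at, say, jiffy 300, in which case the whole queue is flushed then.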

- Joel
Paul E. McKenney Aug. 11, 2022, 2:51 a.m. UTC | #3
On Wed, Aug 10, 2022 at 10:31:56PM -0400, Joel Fernandes wrote:
> 
> 
> On 8/10/2022 10:23 PM, Joel Fernandes wrote:
> > 
> > 
> > On 8/8/2022 11:45 PM, Joel Fernandes (Google) wrote:
> >> Just a refresh of v3 with one additional debug patch. v3's cover letter is here:
> >>  https://lore.kernel.org/all/20220713213237.1596225-1-joel@joelfernandes.org/
> >>
> >> I just started working on this again while I have some time during paternity
> >> leave ;-) So I thought I'll just send it out again. No other changes other
> >> than that 1 debug patch I added on the top.
> >>
> >> Next I am going to go refine the power results as mentioned in Paul's comments
> >> on the last cover letter.
> > 
> > Side note: Here is another big selling point for call_rcu_lazy().
> > Instead of _lazy(), if you just increased jiffies_till_first_fqs, and
> > slowed *all* call_rcu() down to achieve the same effect, that would
> > affect percpu refcounters switching to atomic-mode, for example.
> > 
> > They switch to atomic mode by calling __percpu_ref_switch_mode() which
> > is called by percpu_ref_switch_to_atomic_sync().
> > 
> > This will slow this call down for the full lazy duration which will slow
> > down suspend in blk_pre_runtime_suspend().
> 
> Correction while I am going on the record (got to be careful these
> days). It *might* slow down RCU for the full lazy duration, unless of
> course a fly-by rescue call_rcu() comes in.

Just unload a module, which if I remember correctly invokes rcu_barrier().
Lots of rescue callbacks.  ;-)

							Thanx, Paul
Joel Fernandes Aug. 11, 2022, 3:22 a.m. UTC | #4
On 8/10/2022 10:51 PM, Paul E. McKenney wrote:
> On Wed, Aug 10, 2022 at 10:31:56PM -0400, Joel Fernandes wrote:
>>
>>
>> On 8/10/2022 10:23 PM, Joel Fernandes wrote:
>>>
>>>
>>> On 8/8/2022 11:45 PM, Joel Fernandes (Google) wrote:
>>>> Just a refresh of v3 with one additional debug patch. v3's cover letter is here:
>>>>  https://lore.kernel.org/all/20220713213237.1596225-1-joel@joelfernandes.org/
>>>>
>>>> I just started working on this again while I have some time during paternity
>>>> leave ;-) So I thought I'll just send it out again. No other changes other
>>>> than that 1 debug patch I added on the top.
>>>>
>>>> Next I am going to go refine the power results as mentioned in Paul's comments
>>>> on the last cover letter.
>>>
>>> Side note: Here is another big selling point for call_rcu_lazy().
>>> Instead of _lazy(), if you just increased jiffies_till_first_fqs, and
>>> slowed *all* call_rcu() down to achieve the same effect, that would
>>> affect percpu refcounters switching to atomic-mode, for example.
>>>
>>> They switch to atomic mode by calling __percpu_ref_switch_mode() which
>>> is called by percpu_ref_switch_to_atomic_sync().
>>>
>>> This will slow this call down for the full lazy duration which will slow
>>> down suspend in blk_pre_runtime_suspend().
>>
>> Correction while I am going on the record (got to be careful these
>> days). It *might* slow down RCU for the full lazy duration, unless of
>> course a fly-by rescue call_rcu() comes in.
> 
> Just unload a module, which if I remember correctly invokes rcu_barrier().
> Lots of rescue callbacks.  ;-)

Haha. Yes I suppose the per-cpu atomic switch paths can also invoke
rcu_barrier() but I suspect somebody might complain about IPIs :-P

Thanks,

 - Joel
Paul E. McKenney Aug. 11, 2022, 3:46 a.m. UTC | #5
On Wed, Aug 10, 2022 at 11:22:13PM -0400, Joel Fernandes wrote:
> 
> 
> On 8/10/2022 10:51 PM, Paul E. McKenney wrote:
> > On Wed, Aug 10, 2022 at 10:31:56PM -0400, Joel Fernandes wrote:
> >>
> >>
> >> On 8/10/2022 10:23 PM, Joel Fernandes wrote:
> >>>
> >>>
> >>> On 8/8/2022 11:45 PM, Joel Fernandes (Google) wrote:
> >>>> Just a refresh of v3 with one additional debug patch. v3's cover letter is here:
> >>>>  https://lore.kernel.org/all/20220713213237.1596225-1-joel@joelfernandes.org/
> >>>>
> >>>> I just started working on this again while I have some time during paternity
> >>>> leave ;-) So I thought I'll just send it out again. No other changes other
> >>>> than that 1 debug patch I added on the top.
> >>>>
> >>>> Next I am going to go refine the power results as mentioned in Paul's comments
> >>>> on the last cover letter.
> >>>
> >>> Side note: Here is another big selling point for call_rcu_lazy().
> >>> Instead of _lazy(), if you just increased jiffies_till_first_fqs, and
> >>> slowed *all* call_rcu() down to achieve the same effect, that would
> >>> affect percpu refcounters switching to atomic-mode, for example.
> >>>
> >>> They switch to atomic mode by calling __percpu_ref_switch_mode() which
> >>> is called by percpu_ref_switch_to_atomic_sync().
> >>>
> >>> This will slow this call down for the full lazy duration which will slow
> >>> down suspend in blk_pre_runtime_suspend().
> >>
> >> Correction while I am going on the record (got to be careful these
> >> days). It *might* slow down RCU for the full lazy duration, unless of
> >> course a fly-by rescue call_rcu() comes in.
> > 
> > Just unload a module, which if I remember correctly invokes rcu_barrier().
> > Lots of rescue callbacks.  ;-)
> 
> Haha. Yes I suppose the per-cpu atomic switch paths can also invoke
> rcu_barrier() but I suspect somebody might complain about IPIs :-P

There is always a critic!  ;-)

							Thanx, Paul