[v8,00/13] rcu: call_rcu() power improvements

Message ID 20221011180142.2742289-1-joel@joelfernandes.org (mailing list archive)

Joel Fernandes Oct. 11, 2022, 6:01 p.m. UTC
v8 of the RCU lazy patches, based on the rcu/next branch. Changes since v7 are
very small and mostly cosmetic, the one exception being the rxrpc patch.

I will post the tracing patches later along with rcutop as I need to add new
tracepoints that Frederic suggested.

Main recent changes:
1. rcu_barrier() wake up only for lazy bypass list.
2. Make all call_rcu() default-lazy and add call_rcu_flush() API.
3. Take care of some callers using call_rcu_flush() API.
4. Several refactorings suggested by Paul/Frederic.
5. New call_rcu() to call_rcu_flush() conversions by Joel/Vlad/Paul.
6. New scripts in cover-letter to check for callbacks doing wake-ups.

I am seeing good performance and power with these patches on real ChromeOS x86
asymmetric hardware.

Earlier cover letter with lots of details is here:
https://lore.kernel.org/all/20220901221720.1105021-1-joel@joelfernandes.org/

List of recent changes:
    
    [ Frederic Weisbecker: Program the lazy timer only if WAKE_NOT, since other
      deferral levels wake much earlier so for those it is not needed. ]

    [ Frederic Weisbecker: Use flush flags to keep bypass API code clean. ]

    [ Frederic Weisbecker: Make rcu_barrier() wake up only if main list empty. ]

    [ Frederic Weisbecker: Remove extra 'else if' branch in rcu_nocb_try_bypass(). ]

    [ Joel: Fix issue where I was not resetting lazy_len after moving it to rdp. ]

    [ Paul/Thomas/Joel: Make call_rcu() default lazy so users don't mess up. ]

    [ Paul/Frederic: Cosmetic changes, split out wakeup of nocb thread. ]

    [ Vlad/Joel: More call_rcu -> flush conversions. ]

    [ debug code for detecting "wake" in kernel's call_rcu() callbacks. ]

The following two scripts can be used to check whether any callbacks in the
kernel do a wake-up. They are best effort and may miss some things, but we
found issues using them.

1. Script to search for call_rcu() references and dump the callback list to a file:
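#!/bin/bash

# Start from an empty callback list.
rm -f func-list
touch func-list

# Scan C sources and headers, skipping any path that contains "rcu".
for f in $(find . \( -name "*.c" -o -name "*.h" \) | grep -v rcu); do

	# Extract the callback argument passed to each call_rcu().
	funcs=$(perl -0777 -ne 'while(m/call_rcu\([&]?.+,\s?(.+)\).*;/g){print "$1\n";}' "$f")

	if [ -n "$funcs" ]; then
		for func in $funcs; do
			# Record "<file> <callback>" and echo it to the terminal.
			echo "$f $func" | tee -a func-list
		done
	fi

done

# De-duplicate the list for the second script.
sort -u func-list | tee func-list-sorted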
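
Because the extraction above is best effort, func-list-sorted may contain
entries that are not function names, for example when two statements share a
line or a macro passes an expression. A minimal sketch of an optional filter
follows. The helper name filter_func_list and the sample paths and names are
hypothetical and not part of this series, and the filter may also hide real
callbacks that are written as expressions.

```shell
#!/bin/bash

# Hypothetical helper (not part of this series): keep only "<file> <func>"
# lines whose second field looks like a plain C identifier.
filter_func_list() {
	grep -E '^[^ ]+ [A-Za-z_][A-Za-z0-9_]*$'
}

# Made-up sample standing in for func-list-sorted.
sample='./fs/foo.c foo_free_rcu
./fs/foo.c bar(baz
./net/bar.c head->func);'

filtered=$(printf '%s\n' "$sample" | filter_func_list)
printf '%s\n' "$filtered"
```

On a real tree the same function could be applied as
"filter_func_list < func-list-sorted" before running the second script.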

2. Script to search "wake" after callback references:

#!/bin/bash

# Each input line is "<file> <callback>", as written by the first script.
while read -r file func; do
	# Look for "wake" on, or within 30 lines after, each mention of the callback.
	if grep -F -A 30 -- "$func" "$file" | grep -q wake; then
		echo "keyword wake found after function reference $func in $file"
		echo "Output:"
		grep -F -A 30 -- "$func" "$file"
		echo "==========================================================="
	fi
done < func-list-sorted
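
To see what the wake check reports, it can be tried on two scratch files. This
is only a sketch: the wrapper has_wake_nearby, the callbacks demo_free_cb and
demo_notify_cb, and demo_task are hypothetical names used for illustration.
Matching the name literally with grep -F is an assumption of this sketch.

```shell
#!/bin/bash

# Hypothetical wrapper around the check: succeeds if "wake" appears on a line
# mentioning the function or within the 30 lines that follow it.
has_wake_nearby() {
	local func=$1 file=$2
	grep -F -A 30 -- "$func" "$file" | grep -q wake
}

quiet=$(mktemp)
waker=$(mktemp)

# Made-up callback that only frees memory.
cat > "$quiet" <<'EOF'
static void demo_free_cb(struct rcu_head *rhp)
{
	kfree(rhp);
}
EOF

# Made-up callback that wakes a task.
cat > "$waker" <<'EOF'
static void demo_notify_cb(struct rcu_head *rhp)
{
	wake_up_process(demo_task);
}
EOF

if has_wake_nearby demo_free_cb "$quiet"; then r1=yes; else r1=no; fi
if has_wake_nearby demo_notify_cb "$waker"; then r2=yes; else r2=no; fi

echo "demo_free_cb: $r1"
echo "demo_notify_cb: $r2"

rm -f "$quiet" "$waker"
```

Since the check is a plain 30-line text window, it also fires when "wake"
appears only in a neighbouring function or in a callback's own name, so each
hit still needs a manual look.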

Frederic Weisbecker (1):
  rcu: Fix missing nocb gp wake on rcu_barrier()

Joel Fernandes (Google) (9):
  rcu: Make call_rcu() lazy to save power
  rcu: Refactor code a bit in rcu_nocb_do_flush_bypass()
  rcuscale: Add laziness and kfree tests
  percpu-refcount: Use call_rcu_flush() for atomic switch
  rcu/sync: Use call_rcu_flush() instead of call_rcu
  rcu/rcuscale: Use call_rcu_flush() for async reader test
  rcu/rcutorture: Use call_rcu_flush() where needed
  rxrpc: Use call_rcu_flush() instead of call_rcu()
  rcu/debug: Add wake-up debugging for lazy callbacks

Uladzislau Rezki (2):
  scsi/scsi_error: Use call_rcu_flush() instead of call_rcu()
  workqueue: Make queue_rcu_work() use call_rcu_flush()

Vineeth Pillai (1):
  rcu: shrinker for lazy rcu

drivers/scsi/scsi_error.c |   2 +-
include/linux/rcupdate.h  |   7 ++
kernel/rcu/Kconfig        |  15 +++
kernel/rcu/lazy-debug.h   | 154 +++++++++++++++++++++++++++
kernel/rcu/rcu.h          |   8 ++
kernel/rcu/rcuscale.c     |  70 ++++++++++++-
kernel/rcu/rcutorture.c   |  16 +--
kernel/rcu/sync.c         |   2 +-
kernel/rcu/tiny.c         |   2 +-
kernel/rcu/tree.c         | 144 +++++++++++++++++--------
kernel/rcu/tree.h         |  12 ++-
kernel/rcu/tree_exp.h     |   2 +-
kernel/rcu/tree_nocb.h    | 215 +++++++++++++++++++++++++++++++++-----
kernel/workqueue.c        |   2 +-
lib/percpu-refcount.c     |   3 +-
net/rxrpc/conn_object.c   |   2 +-
16 files changed, 559 insertions(+), 97 deletions(-)
create mode 100644 kernel/rcu/lazy-debug.h

--
2.38.0.rc1.362.ged0d419d3c-goog