diff mbox series

[v2] vmpressure: wake up work only when there is registration event

Message ID 1631635551-8583-1-git-send-email-wang.yong12@zte.com.cn (mailing list archive)
State New
Headers show
Series [v2] vmpressure: wake up work only when there is registration event | expand

Commit Message

yong w Sept. 14, 2021, 4:05 p.m. UTC
From: wangyong <wang.yong12@zte.com.cn>

Use the global variable num_events to record the number of vmpressure
events registered by the system, and wake up work only when there is
registration event.
Usually, the vmpressure event is not registered in the system, this patch
can avoid waking up work and doing nothing.

Test with 5.14.0-rc5-next-20210813 on x86_64 4G ram.
Consume cgroup memory until it is about to be reclaimed, then execute
"perf stat -I 2000 malloc.out" command to trigger memory reclamation
and get performance results.
The context-switches is reduced by about 20 times.

unpatched:
Average of 10 test results
582.4674048	task-clock(msec)
19910.8		context-switches
0		cpu-migrations
1292.9		page-faults
414784733.1	cycles
<not supported>	stalled-cycles-frontend
<not supported>	stalled-cycles-backend
580070698.4	instructions
125572244.7	branches
2073541.2	branch-misses

patched
Average of 10 test results
973.6174796	task-clock(msec)
988.6		context-switches
0		cpu-migrations
1785.2		page-faults
772883602.4	cycles
<not supported>	stalled-cycles-frontend
<not supported>	stalled-cycles-backend
1360280911	instructions
290519434.9	branches
3378378.2	branch-misses

Tested-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: wangyong <wang.yong12@zte.com.cn>
---

Changes since v1:
-Use static_key type data as global variable
-Make event registration judgment earlier

 mm/vmpressure.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Chris Down Sept. 14, 2021, 4:49 p.m. UTC | #1
yongw.pur@gmail.com writes:
>From: wangyong <wang.yong12@zte.com.cn>
>
>Use the global variable num_events to record the number of vmpressure
>events registered by the system, and wake up work only when there is
>registration event.
>Usually, the vmpressure event is not registered in the system, this patch
>can avoid waking up work and doing nothing.
>
>Test with 5.14.0-rc5-next-20210813 on x86_64 4G ram.
>Consume cgroup memory until it is about to be reclaimed, then execute
>"perf stat -I 2000 malloc.out" command to trigger memory reclamation
>and get performance results.
>The context-switches is reduced by about 20 times.
>
>unpatched:
>Average of 10 test results
>582.4674048	task-clock(msec)
>19910.8		context-switches
>0		cpu-migrations
>1292.9		page-faults
>414784733.1	cycles
><not supported>	stalled-cycles-frontend
><not supported>	stalled-cycles-backend
>580070698.4	instructions
>125572244.7	branches
>2073541.2	branch-misses
>
>patched
>Average of 10 test results
>973.6174796	task-clock(msec)
>988.6		context-switches
>0		cpu-migrations
>1785.2		page-faults
>772883602.4	cycles
><not supported>	stalled-cycles-frontend
><not supported>	stalled-cycles-backend
>1360280911	instructions
>290519434.9	branches
>3378378.2	branch-misses
>
>Tested-by: Zeal Robot <zealci@zte.com.cn>

That's not how Tested-by works. Tested-by is for human testers who have 
actively understand and have validated the effects of the code, not CI: please 
remove the tag.

>Signed-off-by: wangyong <wang.yong12@zte.com.cn>
>---
>
>Changes since v1:
>-Use static_key type data as global variable
>-Make event registration judgment earlier
>
> mm/vmpressure.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
>diff --git a/mm/vmpressure.c b/mm/vmpressure.c
>index 76518e4..6f4e984 100644
>--- a/mm/vmpressure.c
>+++ b/mm/vmpressure.c
>@@ -67,6 +67,11 @@ static const unsigned int vmpressure_level_critical = 95;
>  */
> static const unsigned int vmpressure_level_critical_prio = ilog2(100 / 10);
>
>+/*
>+ * Count the number of vmpressure events registered in the system.
>+ */
>+DEFINE_STATIC_KEY_FALSE(num_events);
>+
> static struct vmpressure *work_to_vmpressure(struct work_struct *work)
> {
> 	return container_of(work, struct vmpressure, work);
>@@ -272,6 +277,9 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree,
> 		return;
>
> 	if (tree) {
>+		if (!static_branch_unlikely(&num_events))
>+			return;
>+
> 		spin_lock(&vmpr->sr_lock);
> 		scanned = vmpr->tree_scanned += scanned;
> 		vmpr->tree_reclaimed += reclaimed;
>@@ -407,6 +415,7 @@ int vmpressure_register_event(struct mem_cgroup *memcg,
> 	mutex_lock(&vmpr->events_lock);
> 	list_add(&ev->node, &vmpr->events);
> 	mutex_unlock(&vmpr->events_lock);
>+	static_branch_inc(&num_events);
> 	ret = 0;
> out:
> 	kfree(spec_orig);
>@@ -435,6 +444,7 @@ void vmpressure_unregister_event(struct mem_cgroup *memcg,
> 		if (ev->efd != eventfd)
> 			continue;
> 		list_del(&ev->node);
>+		static_branch_dec(&num_events);
> 		kfree(ev);
> 		break;
> 	}
>-- 
>2.7.4
>
>
Michal Hocko Sept. 15, 2021, 12:42 p.m. UTC | #2
On Tue 14-09-21 09:05:51, yongw.pur@gmail.com wrote:
> From: wangyong <wang.yong12@zte.com.cn>
> 
> Use the global variable num_events to record the number of vmpressure
> events registered by the system, and wake up work only when there is
> registration event.
> Usually, the vmpressure event is not registered in the system, this patch
> can avoid waking up work and doing nothing.

I have asked in the previous version and this changelog doesn't that
explain again. Why don't you simply bail out early in vmpressure()
entry?

> Test with 5.14.0-rc5-next-20210813 on x86_64 4G ram.
> Consume cgroup memory until it is about to be reclaimed, then execute
> "perf stat -I 2000 malloc.out" command to trigger memory reclamation
> and get performance results.
> The context-switches is reduced by about 20 times.

Is this test somewhere available so that it can be reproduced by
others. Also while the number of context switches can be an interesting
it is not really clear from this evaluation whether that actually
matters or not. E.g. what does an increase of task-clock and twice as
many instructions recorded tell us?

> unpatched:
> Average of 10 test results
> 582.4674048	task-clock(msec)
> 19910.8		context-switches
> 0		cpu-migrations
> 1292.9		page-faults
> 414784733.1	cycles

> <not supported>	stalled-cycles-frontend
> <not supported>	stalled-cycles-backend

Why is this a part of the data?

> 580070698.4	instructions
> 125572244.7	branches
> 2073541.2	branch-misses
> 
> patched
> Average of 10 test results
> 973.6174796	task-clock(msec)
> 988.6		context-switches
> 0		cpu-migrations
> 1785.2		page-faults
> 772883602.4	cycles
> <not supported>	stalled-cycles-frontend
> <not supported>	stalled-cycles-backend
> 1360280911	instructions
> 290519434.9	branches
> 3378378.2	branch-misses
> 
> Tested-by: Zeal Robot <zealci@zte.com.cn>
> Signed-off-by: wangyong <wang.yong12@zte.com.cn>
> ---
> 
[...]
> @@ -272,6 +277,9 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree,
>  		return;
>  
>  	if (tree) {
> +		if (!static_branch_unlikely(&num_events))
> +			return;

We usually hide the change behind a static inline helper (e.g.
vmpressure_disabled()). I would also put it to the beginning of
vmpressure or put an explanation why it makes sense only in this branch.
yong w Sept. 17, 2021, 3:38 p.m. UTC | #3
Michal Hocko <mhocko@suse.com> 于2021年9月15日周三 下午8:42写道:
>
> On Tue 14-09-21 09:05:51, yongw.pur@gmail.com wrote:
> > From: wangyong <wang.yong12@zte.com.cn>
> >
> > Use the global variable num_events to record the number of vmpressure
> > events registered by the system, and wake up work only when there is
> > registration event.
> > Usually, the vmpressure event is not registered in the system, this patch
> > can avoid waking up work and doing nothing.
>
> I have asked in the previous version and this changelog doesn't that
> explain again. Why don't you simply bail out early in vmpressure()
> entry?
Thanks for your reply.
Because the else branch will modify the socket_pressure, and will not wake up
the work. It is necessary to judge the tree parameters at the same
time, like this
if (tree && !static_branch_unlikely(&num_events))
    return;
It's not good to judge the tree twice parameters in the function.
>
> > Test with 5.14.0-rc5-next-20210813 on x86_64 4G ram.
> > Consume cgroup memory until it is about to be reclaimed, then execute
> > "perf stat -I 2000 malloc.out" command to trigger memory reclamation
> > and get performance results.
> > The context-switches is reduced by about 20 times.
>
> Is this test somewhere available so that it can be reproduced by
> others. Also while the number of context switches can be an interesting
> it is not really clear from this evaluation whether that actually
> matters or not. E.g. what does an increase of task-clock and twice as
> many instructions recorded tell us?
>
The test program is a simple malloc  process, which allocate memory
and fill some data.
I think it may be that more instructions can be executed per unit time.
> > unpatched:
> > Average of 10 test results
> > 582.4674048   task-clock(msec)
> > 19910.8               context-switches
> > 0             cpu-migrations
> > 1292.9                page-faults
> > 414784733.1   cycles
>
> > <not supported>       stalled-cycles-frontend
> > <not supported>       stalled-cycles-backend
>
> Why is this a part of the data?
>
> > 580070698.4   instructions
> > 125572244.7   branches
> > 2073541.2     branch-misses
> >
> > patched
> > Average of 10 test results
> > 973.6174796   task-clock(msec)
> > 988.6         context-switches
> > 0             cpu-migrations
> > 1785.2                page-faults
> > 772883602.4   cycles
> > <not supported>       stalled-cycles-frontend
> > <not supported>       stalled-cycles-backend
> > 1360280911    instructions
> > 290519434.9   branches
> > 3378378.2     branch-misses
> >
> > Tested-by: Zeal Robot <zealci@zte.com.cn>
> > Signed-off-by: wangyong <wang.yong12@zte.com.cn>
> > ---
> >
> [...]
> > @@ -272,6 +277,9 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree,
> >               return;
> >
> >       if (tree) {
> > +             if (!static_branch_unlikely(&num_events))
> > +                     return;
>
> We usually hide the change behind a static inline helper (e.g.
> vmpressure_disabled()). I would also put it to the beginning of
> vmpressure or put an explanation why it makes sense only in this branch.
> --
Because only this branch needs to wake up work.
Yes, static inline helper is more easier to read and understand.
diff mbox series

Patch

diff --git a/mm/vmpressure.c b/mm/vmpressure.c
index 76518e4..6f4e984 100644
--- a/mm/vmpressure.c
+++ b/mm/vmpressure.c
@@ -67,6 +67,11 @@  static const unsigned int vmpressure_level_critical = 95;
  */
 static const unsigned int vmpressure_level_critical_prio = ilog2(100 / 10);
 
+/*
+ * Count the number of vmpressure events registered in the system.
+ */
+DEFINE_STATIC_KEY_FALSE(num_events);
+
 static struct vmpressure *work_to_vmpressure(struct work_struct *work)
 {
 	return container_of(work, struct vmpressure, work);
@@ -272,6 +277,9 @@  void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree,
 		return;
 
 	if (tree) {
+		if (!static_branch_unlikely(&num_events))
+			return;
+
 		spin_lock(&vmpr->sr_lock);
 		scanned = vmpr->tree_scanned += scanned;
 		vmpr->tree_reclaimed += reclaimed;
@@ -407,6 +415,7 @@  int vmpressure_register_event(struct mem_cgroup *memcg,
 	mutex_lock(&vmpr->events_lock);
 	list_add(&ev->node, &vmpr->events);
 	mutex_unlock(&vmpr->events_lock);
+	static_branch_inc(&num_events);
 	ret = 0;
 out:
 	kfree(spec_orig);
@@ -435,6 +444,7 @@  void vmpressure_unregister_event(struct mem_cgroup *memcg,
 		if (ev->efd != eventfd)
 			continue;
 		list_del(&ev->node);
+		static_branch_dec(&num_events);
 		kfree(ev);
 		break;
 	}