Message ID | 1631635551-8583-1-git-send-email-wang.yong12@zte.com.cn (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v2] vmpressure: wake up work only when there is registration event | expand |
yongw.pur@gmail.com writes: >From: wangyong <wang.yong12@zte.com.cn> > >Use the global variable num_events to record the number of vmpressure >events registered by the system, and wake up work only when there is >registration event. >Usually, the vmpressure event is not registered in the system, this patch >can avoid waking up work and doing nothing. > >Test with 5.14.0-rc5-next-20210813 on x86_64 4G ram. >Consume cgroup memory until it is about to be reclaimed, then execute >"perf stat -I 2000 malloc.out" command to trigger memory reclamation >and get performance results. >The context-switches is reduced by about 20 times. > >unpatched: >Average of 10 test results >582.4674048 task-clock(msec) >19910.8 context-switches >0 cpu-migrations >1292.9 page-faults >414784733.1 cycles ><not supported> stalled-cycles-frontend ><not supported> stalled-cycles-backend >580070698.4 instructions >125572244.7 branches >2073541.2 branch-misses > >patched >Average of 10 test results >973.6174796 task-clock(msec) >988.6 context-switches >0 cpu-migrations >1785.2 page-faults >772883602.4 cycles ><not supported> stalled-cycles-frontend ><not supported> stalled-cycles-backend >1360280911 instructions >290519434.9 branches >3378378.2 branch-misses > >Tested-by: Zeal Robot <zealci@zte.com.cn> That's not how Tested-by works. Tested-by is for human testers who have actively understand and have validated the effects of the code, not CI: please remove the tag. >Signed-off-by: wangyong <wang.yong12@zte.com.cn> >--- > >Changes since v1: >-Use static_key type data as global variable >-Make event registration judgment earlier > > mm/vmpressure.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > >diff --git a/mm/vmpressure.c b/mm/vmpressure.c >index 76518e4..6f4e984 100644 >--- a/mm/vmpressure.c >+++ b/mm/vmpressure.c >@@ -67,6 +67,11 @@ static const unsigned int vmpressure_level_critical = 95; > */ > static const unsigned int vmpressure_level_critical_prio = ilog2(100 / 10); > >+/* >+ * Count the number of vmpressure events registered in the system. >+ */ >+DEFINE_STATIC_KEY_FALSE(num_events); >+ > static struct vmpressure *work_to_vmpressure(struct work_struct *work) > { > return container_of(work, struct vmpressure, work); >@@ -272,6 +277,9 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree, > return; > > if (tree) { >+ if (!static_branch_unlikely(&num_events)) >+ return; >+ > spin_lock(&vmpr->sr_lock); > scanned = vmpr->tree_scanned += scanned; > vmpr->tree_reclaimed += reclaimed; >@@ -407,6 +415,7 @@ int vmpressure_register_event(struct mem_cgroup *memcg, > mutex_lock(&vmpr->events_lock); > list_add(&ev->node, &vmpr->events); > mutex_unlock(&vmpr->events_lock); >+ static_branch_inc(&num_events); > ret = 0; > out: > kfree(spec_orig); >@@ -435,6 +444,7 @@ void vmpressure_unregister_event(struct mem_cgroup *memcg, > if (ev->efd != eventfd) > continue; > list_del(&ev->node); >+ static_branch_dec(&num_events); > kfree(ev); > break; > } >-- >2.7.4 > >
On Tue 14-09-21 09:05:51, yongw.pur@gmail.com wrote: > From: wangyong <wang.yong12@zte.com.cn> > > Use the global variable num_events to record the number of vmpressure > events registered by the system, and wake up work only when there is > registration event. > Usually, the vmpressure event is not registered in the system, this patch > can avoid waking up work and doing nothing. I have asked in the previous version and this changelog doesn't that explain again. Why don't you simply bail out early in vmpressure() entry? > Test with 5.14.0-rc5-next-20210813 on x86_64 4G ram. > Consume cgroup memory until it is about to be reclaimed, then execute > "perf stat -I 2000 malloc.out" command to trigger memory reclamation > and get performance results. > The context-switches is reduced by about 20 times. Is this test somewhere available so that it can be reproduced by others. Also while the number of context switches can be an interesting it is not really clear from this evaluation whether that actually matters or not. E.g. what does an increase of task-clock and twice as many instructions recorded tell us? > unpatched: > Average of 10 test results > 582.4674048 task-clock(msec) > 19910.8 context-switches > 0 cpu-migrations > 1292.9 page-faults > 414784733.1 cycles > <not supported> stalled-cycles-frontend > <not supported> stalled-cycles-backend Why is this a part of the data? > 580070698.4 instructions > 125572244.7 branches > 2073541.2 branch-misses > > patched > Average of 10 test results > 973.6174796 task-clock(msec) > 988.6 context-switches > 0 cpu-migrations > 1785.2 page-faults > 772883602.4 cycles > <not supported> stalled-cycles-frontend > <not supported> stalled-cycles-backend > 1360280911 instructions > 290519434.9 branches > 3378378.2 branch-misses > > Tested-by: Zeal Robot <zealci@zte.com.cn> > Signed-off-by: wangyong <wang.yong12@zte.com.cn> > --- > [...] > @@ -272,6 +277,9 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree, > return; > > if (tree) { > + if (!static_branch_unlikely(&num_events)) > + return; We usually hide the change behind a static inline helper (e.g. vmpressure_disabled()). I would also put it to the beginning of vmpressure or put an explanation why it makes sense only in this branch.
Michal Hocko <mhocko@suse.com> 于2021年9月15日周三 下午8:42写道: > > On Tue 14-09-21 09:05:51, yongw.pur@gmail.com wrote: > > From: wangyong <wang.yong12@zte.com.cn> > > > > Use the global variable num_events to record the number of vmpressure > > events registered by the system, and wake up work only when there is > > registration event. > > Usually, the vmpressure event is not registered in the system, this patch > > can avoid waking up work and doing nothing. > > I have asked in the previous version and this changelog doesn't that > explain again. Why don't you simply bail out early in vmpressure() > entry? Thanks for your reply. Because the else branch will modify the socket_pressure, and will not wake up the work. It is necessary to judge the tree parameters at the same time, like this if (tree && !static_branch_unlikely(&num_events)) return; It's not good to judge the tree twice parameters in the function. > > > Test with 5.14.0-rc5-next-20210813 on x86_64 4G ram. > > Consume cgroup memory until it is about to be reclaimed, then execute > > "perf stat -I 2000 malloc.out" command to trigger memory reclamation > > and get performance results. > > The context-switches is reduced by about 20 times. > > Is this test somewhere available so that it can be reproduced by > others. Also while the number of context switches can be an interesting > it is not really clear from this evaluation whether that actually > matters or not. E.g. what does an increase of task-clock and twice as > many instructions recorded tell us? > The test program is a simple malloc process, which allocate memory and fill some data. I think it may be that more instructions can be executed per unit time. > > unpatched: > > Average of 10 test results > > 582.4674048 task-clock(msec) > > 19910.8 context-switches > > 0 cpu-migrations > > 1292.9 page-faults > > 414784733.1 cycles > > > <not supported> stalled-cycles-frontend > > <not supported> stalled-cycles-backend > > Why is this a part of the data? > > > 580070698.4 instructions > > 125572244.7 branches > > 2073541.2 branch-misses > > > > patched > > Average of 10 test results > > 973.6174796 task-clock(msec) > > 988.6 context-switches > > 0 cpu-migrations > > 1785.2 page-faults > > 772883602.4 cycles > > <not supported> stalled-cycles-frontend > > <not supported> stalled-cycles-backend > > 1360280911 instructions > > 290519434.9 branches > > 3378378.2 branch-misses > > > > Tested-by: Zeal Robot <zealci@zte.com.cn> > > Signed-off-by: wangyong <wang.yong12@zte.com.cn> > > --- > > > [...] > > @@ -272,6 +277,9 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree, > > return; > > > > if (tree) { > > + if (!static_branch_unlikely(&num_events)) > > + return; > > We usually hide the change behind a static inline helper (e.g. > vmpressure_disabled()). I would also put it to the beginning of > vmpressure or put an explanation why it makes sense only in this branch. > -- Because only this branch needs to wake up work. Yes, static inline helper is more easier to read and understand.
diff --git a/mm/vmpressure.c b/mm/vmpressure.c index 76518e4..6f4e984 100644 --- a/mm/vmpressure.c +++ b/mm/vmpressure.c @@ -67,6 +67,11 @@ static const unsigned int vmpressure_level_critical = 95; */ static const unsigned int vmpressure_level_critical_prio = ilog2(100 / 10); +/* + * Count the number of vmpressure events registered in the system. + */ +DEFINE_STATIC_KEY_FALSE(num_events); + static struct vmpressure *work_to_vmpressure(struct work_struct *work) { return container_of(work, struct vmpressure, work); @@ -272,6 +277,9 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree, return; if (tree) { + if (!static_branch_unlikely(&num_events)) + return; + spin_lock(&vmpr->sr_lock); scanned = vmpr->tree_scanned += scanned; vmpr->tree_reclaimed += reclaimed; @@ -407,6 +415,7 @@ int vmpressure_register_event(struct mem_cgroup *memcg, mutex_lock(&vmpr->events_lock); list_add(&ev->node, &vmpr->events); mutex_unlock(&vmpr->events_lock); + static_branch_inc(&num_events); ret = 0; out: kfree(spec_orig); @@ -435,6 +444,7 @@ void vmpressure_unregister_event(struct mem_cgroup *memcg, if (ev->efd != eventfd) continue; list_del(&ev->node); + static_branch_dec(&num_events); kfree(ev); break; }