Message ID | 20240418142008.2775308-2-zhangpeng362@huawei.com (mailing list archive)
---|---
State | New
Series | mm: convert mm's rss stats to use atomic mode
On Thu, 18 Apr 2024 22:20:07 +0800 Peng Zhang <zhangpeng362@huawei.com> wrote: > From: ZhangPeng <zhangpeng362@huawei.com> > > Depending on whether counters is NULL, we can support two modes: > atomic mode and perpcu mode. We implement both modes by grouping > the s64 count and atomic64_t count_atomic in a union. At the same time, > we create the interface for adding and reading in atomic mode and for > switching atomic mode to percpu mode. > I think it would be better if we had a detailed code comment in an appropriate place which fully describes the tradeoffs here. Tell people when they would benefit from using one mode versus the other. > --- a/lib/percpu_counter.c > +++ b/lib/percpu_counter.c > @@ -153,7 +153,7 @@ EXPORT_SYMBOL(__percpu_counter_sum); > > int __percpu_counter_init_many(struct percpu_counter *fbc, s64 amount, > gfp_t gfp, u32 nr_counters, > - struct lock_class_key *key) > + struct lock_class_key *key, bool switch_mode) > { > unsigned long flags __maybe_unused; > size_t counter_size; > @@ -174,7 +174,8 @@ int __percpu_counter_init_many(struct percpu_counter *fbc, s64 amount, > #ifdef CONFIG_HOTPLUG_CPU > INIT_LIST_HEAD(&fbc[i].list); > #endif > - fbc[i].count = amount; > + if (likely(!switch_mode)) > + fbc[i].count = amount; > fbc[i].counters = (void *)counters + (i * counter_size); > > debug_percpu_counter_activate(&fbc[i]); > @@ -357,6 +358,32 @@ bool __percpu_counter_limited_add(struct percpu_counter *fbc, > return good; > } > > +/* > + * percpu_counter_switch_to_pcpu_many: Converts struct percpu_counters from > + * atomic mode to percpu mode. Describe what happens if that GFP_ATOMIC allocation fails. We remain in atomic mode, yes? > + */ > +int percpu_counter_switch_to_pcpu_many(struct percpu_counter *fbc, > + u32 nr_counters) > +{ > + static struct lock_class_key __key; > + unsigned long flags; > + bool ret = 0; > + > + if (percpu_counter_initialized(fbc)) > + return 0; > + > + preempt_disable(); > + local_irq_save(flags); Do we need both? Does local_irq_save() always disable preemption? This might not be the case for RT kernels, I always forget. > + if (likely(!percpu_counter_initialized(fbc))) > + ret = __percpu_counter_init_many(fbc, 0, > + GFP_ATOMIC|__GFP_NOWARN|__GFP_ZERO, > + nr_counters, &__key, true); > + local_irq_restore(flags); > + preempt_enable(); > + > + return ret; > +} > + Why is there no API for switching back to atomic mode?
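To make the tradeoff Andrew is asking about concrete, here is a minimal sketch of an exact read that works in either mode. It is illustrative only: stat_read_exact() is not part of the patch, while the percpu_counter_* calls are either long-standing API or the helpers this patch introduces.

/*
 * Illustrative sketch, not from the patch: atomic mode keeps one shared
 * atomic64_t (no percpu allocation, exact and cheap reads, but every
 * update is a contended atomic RMW), whereas percpu mode batches per-CPU
 * deltas into ->count under ->lock (cheap concurrent updates, but an
 * exact read has to walk all CPUs and each counter costs percpu memory).
 */
static s64 stat_read_exact(struct percpu_counter *fbc)
{
	if (percpu_counter_initialized(fbc))	/* percpu mode */
		return percpu_counter_sum(fbc);	/* fold per-CPU deltas */
	return percpu_counter_atomic_read(fbc);	/* atomic mode: already exact */
}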
On 2024/4/19 3:40, Andrew Morton wrote: > On Thu, 18 Apr 2024 22:20:07 +0800 Peng Zhang <zhangpeng362@huawei.com> wrote: > >> From: ZhangPeng <zhangpeng362@huawei.com> >> >> Depending on whether counters is NULL, we can support two modes: >> atomic mode and perpcu mode. We implement both modes by grouping >> the s64 count and atomic64_t count_atomic in a union. At the same time, >> we create the interface for adding and reading in atomic mode and for >> switching atomic mode to percpu mode. >> > I think it would be better if we had a detailed code comment in an > appropriate place which fully describes the tradeoffs here. Tell > people when they would benefit from using one mode versus the other. > Thanks for your reply! Yes, of course, I would add before the union: /* * Depending on whether counters is NULL, we can support two modes, * atomic mode using count_atomic and perpcu mode using count. * The single-thread processes should use atomic mode to reduce the * memory consumption and performance regression. * The multiple-thread processes should use percpu mode to reduce the * error margin. */ union { s64 count; atomic64_t count_atomic; }; >> --- a/lib/percpu_counter.c >> +++ b/lib/percpu_counter.c >> @@ -153,7 +153,7 @@ EXPORT_SYMBOL(__percpu_counter_sum); >> >> int __percpu_counter_init_many(struct percpu_counter *fbc, s64 amount, >> gfp_t gfp, u32 nr_counters, >> - struct lock_class_key *key) >> + struct lock_class_key *key, bool switch_mode) >> { >> unsigned long flags __maybe_unused; >> size_t counter_size; >> @@ -174,7 +174,8 @@ int __percpu_counter_init_many(struct percpu_counter *fbc, s64 amount, >> #ifdef CONFIG_HOTPLUG_CPU >> INIT_LIST_HEAD(&fbc[i].list); >> #endif >> - fbc[i].count = amount; >> + if (likely(!switch_mode)) >> + fbc[i].count = amount; >> fbc[i].counters = (void *)counters + (i * counter_size); >> >> debug_percpu_counter_activate(&fbc[i]); >> @@ -357,6 +358,32 @@ bool __percpu_counter_limited_add(struct percpu_counter *fbc, >> return good; >> } >> >> +/* >> + * percpu_counter_switch_to_pcpu_many: Converts struct percpu_counters from >> + * atomic mode to percpu mode. > Describe what happens if that GFP_ATOMIC allocation fails. We remain > in atomic mode, yes? Yes. I'll add comments like: * Return: 0 if percpu_counter is already in atomic mode or successfully * switched to atomic mode; -ENOMEM if perpcu memory allocation fails, * perpcu_counter is still in atomic mode. >> + */ >> +int percpu_counter_switch_to_pcpu_many(struct percpu_counter *fbc, >> + u32 nr_counters) >> +{ >> + static struct lock_class_key __key; >> + unsigned long flags; >> + bool ret = 0; >> + >> + if (percpu_counter_initialized(fbc)) >> + return 0; >> + >> + preempt_disable(); >> + local_irq_save(flags); > Do we need both? Does local_irq_save() always disable preemption? > This might not be the case for RT kernels, I always forget. Yes, we need both. For RT kernels, local_irq_save() doesn't mean that preemption is disabled. >> + if (likely(!percpu_counter_initialized(fbc))) >> + ret = __percpu_counter_init_many(fbc, 0, >> + GFP_ATOMIC|__GFP_NOWARN|__GFP_ZERO, >> + nr_counters, &__key, true); >> + local_irq_restore(flags); >> + preempt_enable(); >> + >> + return ret; >> +} >> + > Why is there no API for switching back to atomic mode? At least for the current test scenario, we don't need to switch back to atomic mode. The scenario for switching back to atomic mode may be that the multi-threaded process destroys the last second thread. I could add this API if needed later.
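The failure semantics described in that reply amount to the following caller pattern. This is only a sketch under the stated return-value convention; maybe_upgrade_stats() is a hypothetical wrapper, not something in the series.

/*
 * Hypothetical caller, illustrating the documented return values:
 * 0 if the counters are already in (or were switched to) percpu mode,
 * -ENOMEM if the GFP_ATOMIC allocation failed - in which case the
 * counters simply stay in atomic mode and remain fully usable, so the
 * error can be ignored or retried later.
 */
static void maybe_upgrade_stats(struct percpu_counter *stats, u32 nr)
{
	int err = percpu_counter_switch_to_pcpu_many(stats, nr);

	if (err)	/* -ENOMEM: still in atomic mode */
		pr_debug("stats stay in atomic mode (%d)\n", err);
}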
On Thu, Apr 18, 2024 at 10:20:07PM +0800, Peng Zhang wrote: > From: ZhangPeng <zhangpeng362@huawei.com> > > Depending on whether counters is NULL, we can support two modes: > atomic mode and perpcu mode. We implement both modes by grouping > the s64 count and atomic64_t count_atomic in a union. At the same time, > we create the interface for adding and reading in atomic mode and for > switching atomic mode to percpu mode. > > Suggested-by: Jan Kara <jack@suse.cz> > Signed-off-by: ZhangPeng <zhangpeng362@huawei.com> > Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> > --- > include/linux/percpu_counter.h | 43 +++++++++++++++++++++++++++++++--- > lib/percpu_counter.c | 31 ++++++++++++++++++++++-- > 2 files changed, 69 insertions(+), 5 deletions(-) > > diff --git a/include/linux/percpu_counter.h b/include/linux/percpu_counter.h > index 3a44dd1e33d2..160f9734c0bb 100644 > --- a/include/linux/percpu_counter.h > +++ b/include/linux/percpu_counter.h > @@ -21,7 +21,13 @@ > > struct percpu_counter { > raw_spinlock_t lock; > - s64 count; > + /* Depending on whether counters is NULL, we can support two modes, > + * atomic mode using count_atomic and perpcu mode using count. > + */ > + union { > + s64 count; > + atomic64_t count_atomic; > + }; > #ifdef CONFIG_HOTPLUG_CPU > struct list_head list; /* All percpu_counters are on a list */ > #endif > @@ -32,14 +38,14 @@ extern int percpu_counter_batch; > > int __percpu_counter_init_many(struct percpu_counter *fbc, s64 amount, > gfp_t gfp, u32 nr_counters, > - struct lock_class_key *key); > + struct lock_class_key *key, bool switch_mode); Nit: the lock_class_key at the end. > > #define percpu_counter_init_many(fbc, value, gfp, nr_counters) \ > ({ \ > static struct lock_class_key __key; \ > \ > __percpu_counter_init_many(fbc, value, gfp, nr_counters,\ > - &__key); \ > + &__key, false); \ > }) > > > @@ -130,6 +136,20 @@ static inline bool percpu_counter_initialized(struct percpu_counter *fbc) > return (fbc->counters != NULL); > } > > +static inline s64 percpu_counter_atomic_read(struct percpu_counter *fbc) > +{ > + return atomic64_read(&fbc->count_atomic); > +} > + > +static inline void percpu_counter_atomic_add(struct percpu_counter *fbc, > + s64 amount) > +{ > + atomic64_add(amount, &fbc->count_atomic); > +} > + > +int percpu_counter_switch_to_pcpu_many(struct percpu_counter *fbc, > + u32 nr_counters); > + > #else /* !CONFIG_SMP */ > > struct percpu_counter { > @@ -260,6 +280,23 @@ static inline bool percpu_counter_initialized(struct percpu_counter *fbc) > static inline void percpu_counter_sync(struct percpu_counter *fbc) > { > } > + > +static inline s64 percpu_counter_atomic_read(struct percpu_counter *fbc) > +{ > + return fbc->count; > +} > + > +static inline void percpu_counter_atomic_add(struct percpu_counter *fbc, > + s64 amount) > +{ > + percpu_counter_add(fbc, amount); > +} > + > +static inline int percpu_counter_switch_to_pcpu_many(struct percpu_counter *fbc, > + u32 nr_counters) > +{ > + return 0; > +} > #endif /* CONFIG_SMP */ > > static inline void percpu_counter_inc(struct percpu_counter *fbc) > diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c > index 44dd133594d4..95c4e038051a 100644 > --- a/lib/percpu_counter.c > +++ b/lib/percpu_counter.c > @@ -153,7 +153,7 @@ EXPORT_SYMBOL(__percpu_counter_sum); > > int __percpu_counter_init_many(struct percpu_counter *fbc, s64 amount, > gfp_t gfp, u32 nr_counters, > - struct lock_class_key *key) > + struct lock_class_key *key, bool switch_mode) > { > unsigned long flags __maybe_unused; > 
size_t counter_size; > @@ -174,7 +174,8 @@ int __percpu_counter_init_many(struct percpu_counter *fbc, s64 amount, > #ifdef CONFIG_HOTPLUG_CPU > INIT_LIST_HEAD(&fbc[i].list); > #endif > - fbc[i].count = amount; > + if (likely(!switch_mode)) > + fbc[i].count = amount; > fbc[i].counters = (void *)counters + (i * counter_size); > > debug_percpu_counter_activate(&fbc[i]); > @@ -357,6 +358,32 @@ bool __percpu_counter_limited_add(struct percpu_counter *fbc, > return good; > } > > +/* > + * percpu_counter_switch_to_pcpu_many: Converts struct percpu_counters from > + * atomic mode to percpu mode. > + */ > +int percpu_counter_switch_to_pcpu_many(struct percpu_counter *fbc, > + u32 nr_counters) > +{ > + static struct lock_class_key __key; This is an improper use of lockdep. Now all of the percpu_counters initialized on this path will key off of this lock_class_key. > + unsigned long flags; > + bool ret = 0; > + > + if (percpu_counter_initialized(fbc)) > + return 0; > + > + preempt_disable(); > + local_irq_save(flags); > + if (likely(!percpu_counter_initialized(fbc))) > + ret = __percpu_counter_init_many(fbc, 0, > + GFP_ATOMIC|__GFP_NOWARN|__GFP_ZERO, > + nr_counters, &__key, true); > + local_irq_restore(flags); > + preempt_enable(); > + > + return ret; > +} I'm staring at this API and I'm not in love with it. I think it hinges on that there's one user of mm_stats prior hence it's safe. Generically though, I can't see why this is safe. I need to give this a little more thought, but my gut reaction is I'd rather this look like percpu_refcount where we can init dead minus the percpu allocation. Maybe that's not quite right, but I'd feel better about it than requiring external synchronization on a percpu_counter to ensure that it's correct. > + > static int __init percpu_counter_startup(void) > { > int ret; > -- > 2.25.1 > Thanks, Dennis
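For comparison, the existing percpu_ref interface that Dennis appears to be pointing at follows roughly the pattern below: the counter is initialised in atomic mode and explicitly switched to percpu mode later. The calls are real API, but the usage is only a sketch; note that percpu_ref still allocates its percpu storage at init time, so "init dead minus the percpu allocation" would mean deferring that allocation to the switch.

#include <linux/percpu-refcount.h>

static void example_release(struct percpu_ref *ref)
{
	/* invoked once the last reference is dropped */
}

static int example_init(struct percpu_ref *ref)
{
	int err;

	/* start out in atomic mode */
	err = percpu_ref_init(ref, example_release,
			      PERCPU_REF_INIT_ATOMIC, GFP_KERNEL);
	if (err)
		return err;

	/* later, once percpu operation is wanted */
	percpu_ref_switch_to_percpu(ref);
	return 0;
}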
On 2024/4/26 16:11, Dennis Zhou wrote: > On Thu, Apr 18, 2024 at 10:20:07PM +0800, Peng Zhang wrote: >> From: ZhangPeng <zhangpeng362@huawei.com> >> >> Depending on whether counters is NULL, we can support two modes: >> atomic mode and perpcu mode. We implement both modes by grouping >> the s64 count and atomic64_t count_atomic in a union. At the same time, >> we create the interface for adding and reading in atomic mode and for >> switching atomic mode to percpu mode. >> >> Suggested-by: Jan Kara <jack@suse.cz> >> Signed-off-by: ZhangPeng <zhangpeng362@huawei.com> >> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> >> --- >> include/linux/percpu_counter.h | 43 +++++++++++++++++++++++++++++++--- >> lib/percpu_counter.c | 31 ++++++++++++++++++++++-- >> 2 files changed, 69 insertions(+), 5 deletions(-) >> >> diff --git a/include/linux/percpu_counter.h b/include/linux/percpu_counter.h >> index 3a44dd1e33d2..160f9734c0bb 100644 >> --- a/include/linux/percpu_counter.h >> +++ b/include/linux/percpu_counter.h >> @@ -21,7 +21,13 @@ >> >> struct percpu_counter { >> raw_spinlock_t lock; >> - s64 count; >> + /* Depending on whether counters is NULL, we can support two modes, >> + * atomic mode using count_atomic and perpcu mode using count. >> + */ >> + union { >> + s64 count; >> + atomic64_t count_atomic; >> + }; >> #ifdef CONFIG_HOTPLUG_CPU >> struct list_head list; /* All percpu_counters are on a list */ >> #endif >> @@ -32,14 +38,14 @@ extern int percpu_counter_batch; >> >> int __percpu_counter_init_many(struct percpu_counter *fbc, s64 amount, >> gfp_t gfp, u32 nr_counters, >> - struct lock_class_key *key); >> + struct lock_class_key *key, bool switch_mode); > Nit: the lock_class_key at the end. Got it. >> >> #define percpu_counter_init_many(fbc, value, gfp, nr_counters) \ >> ({ \ >> static struct lock_class_key __key; \ >> \ >> __percpu_counter_init_many(fbc, value, gfp, nr_counters,\ >> - &__key); \ >> + &__key, false); \ >> }) >> >> >> @@ -130,6 +136,20 @@ static inline bool percpu_counter_initialized(struct percpu_counter *fbc) >> return (fbc->counters != NULL); >> } >> >> +static inline s64 percpu_counter_atomic_read(struct percpu_counter *fbc) >> +{ >> + return atomic64_read(&fbc->count_atomic); >> +} >> + >> +static inline void percpu_counter_atomic_add(struct percpu_counter *fbc, >> + s64 amount) >> +{ >> + atomic64_add(amount, &fbc->count_atomic); >> +} >> + >> +int percpu_counter_switch_to_pcpu_many(struct percpu_counter *fbc, >> + u32 nr_counters); >> + >> #else /* !CONFIG_SMP */ >> >> struct percpu_counter { >> @@ -260,6 +280,23 @@ static inline bool percpu_counter_initialized(struct percpu_counter *fbc) >> static inline void percpu_counter_sync(struct percpu_counter *fbc) >> { >> } >> + >> +static inline s64 percpu_counter_atomic_read(struct percpu_counter *fbc) >> +{ >> + return fbc->count; >> +} >> + >> +static inline void percpu_counter_atomic_add(struct percpu_counter *fbc, >> + s64 amount) >> +{ >> + percpu_counter_add(fbc, amount); >> +} >> + >> +static inline int percpu_counter_switch_to_pcpu_many(struct percpu_counter *fbc, >> + u32 nr_counters) >> +{ >> + return 0; >> +} >> #endif /* CONFIG_SMP */ >> >> static inline void percpu_counter_inc(struct percpu_counter *fbc) >> diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c >> index 44dd133594d4..95c4e038051a 100644 >> --- a/lib/percpu_counter.c >> +++ b/lib/percpu_counter.c >> @@ -153,7 +153,7 @@ EXPORT_SYMBOL(__percpu_counter_sum); >> >> int __percpu_counter_init_many(struct percpu_counter *fbc, s64 amount, >> 
gfp_t gfp, u32 nr_counters, >> - struct lock_class_key *key) >> + struct lock_class_key *key, bool switch_mode) >> { >> unsigned long flags __maybe_unused; >> size_t counter_size; >> @@ -174,7 +174,8 @@ int __percpu_counter_init_many(struct percpu_counter *fbc, s64 amount, >> #ifdef CONFIG_HOTPLUG_CPU >> INIT_LIST_HEAD(&fbc[i].list); >> #endif >> - fbc[i].count = amount; >> + if (likely(!switch_mode)) >> + fbc[i].count = amount; >> fbc[i].counters = (void *)counters + (i * counter_size); >> >> debug_percpu_counter_activate(&fbc[i]); >> @@ -357,6 +358,32 @@ bool __percpu_counter_limited_add(struct percpu_counter *fbc, >> return good; >> } >> >> +/* >> + * percpu_counter_switch_to_pcpu_many: Converts struct percpu_counters from >> + * atomic mode to percpu mode. >> + */ >> +int percpu_counter_switch_to_pcpu_many(struct percpu_counter *fbc, >> + u32 nr_counters) >> +{ >> + static struct lock_class_key __key; > This is an improper use of lockdep. Now all of the percpu_counters > initialized on this path will key off of this lock_class_key. Sorry, I don't know much about lock_class_key. I may not understand the reason why it's not appropriate here. Could you please help me with the details? >> + unsigned long flags; >> + bool ret = 0; >> + >> + if (percpu_counter_initialized(fbc)) >> + return 0; >> + >> + preempt_disable(); >> + local_irq_save(flags); >> + if (likely(!percpu_counter_initialized(fbc))) >> + ret = __percpu_counter_init_many(fbc, 0, >> + GFP_ATOMIC|__GFP_NOWARN|__GFP_ZERO, >> + nr_counters, &__key, true); >> + local_irq_restore(flags); >> + preempt_enable(); >> + >> + return ret; >> +} > I'm staring at this API and I'm not in love with it. I think it hinges > on that there's one user of mm_stats prior hence it's safe. Generically > though, I can't see why this is safe. > > I need to give this a little more thought, but my gut reaction is I'd > rather this look like percpu_refcount where we can init dead minus the > percpu allocation. Maybe that's not quite right, but I'd feel better > about it than requiring external synchronization on a percpu_counter to > ensure that it's correct. Maybe it would be better if I change this API to the internal function of mm counter. Sorry again, maybe percpu_refcount is better, but I don't understand this sentence "we can init dead minus the percpu allocation." Please forgive my ignorance... Thank you very much for your review and valuable comments! >> + >> static int __init percpu_counter_startup(void) >> { >> int ret; >> -- >> 2.25.1 >> > Thanks, > Dennis
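Some background on the lockdep point left open above: each static struct lock_class_key defines one lockdep class, and the percpu_counter_init_many() macro deliberately expands a new static key per call site, so counters created by different subsystems get distinct classes for fbc->lock. A single static key inside percpu_counter_switch_to_pcpu_many() puts every counter that is ever switched, regardless of which subsystem owns it, into one class, which can merge unrelated lock dependency chains in lockdep reports. One possible alternative, shown below purely as a sketch and not as an agreed interface, is to take the key from the caller and wrap it in a macro, mirroring the init path.

/*
 * Sketch only - hypothetical interface, not part of the patch: each call
 * site expands its own static key, exactly like percpu_counter_init_many().
 */
int __percpu_counter_switch_to_pcpu_many(struct percpu_counter *fbc,
					 u32 nr_counters,
					 struct lock_class_key *key);

#define percpu_counter_switch_to_pcpu_many(fbc, nr_counters)		\
	({								\
		static struct lock_class_key __key;			\
									\
		__percpu_counter_switch_to_pcpu_many(fbc, nr_counters,	\
						     &__key);		\
	})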
diff --git a/include/linux/percpu_counter.h b/include/linux/percpu_counter.h
index 3a44dd1e33d2..160f9734c0bb 100644
--- a/include/linux/percpu_counter.h
+++ b/include/linux/percpu_counter.h
@@ -21,7 +21,13 @@
 
 struct percpu_counter {
 	raw_spinlock_t lock;
-	s64 count;
+	/* Depending on whether counters is NULL, we can support two modes,
+	 * atomic mode using count_atomic and perpcu mode using count.
+	 */
+	union {
+		s64 count;
+		atomic64_t count_atomic;
+	};
 #ifdef CONFIG_HOTPLUG_CPU
 	struct list_head list;	/* All percpu_counters are on a list */
 #endif
@@ -32,14 +38,14 @@ extern int percpu_counter_batch;
 
 int __percpu_counter_init_many(struct percpu_counter *fbc, s64 amount,
 			       gfp_t gfp, u32 nr_counters,
-			       struct lock_class_key *key);
+			       struct lock_class_key *key, bool switch_mode);
 
 #define percpu_counter_init_many(fbc, value, gfp, nr_counters)		\
 	({								\
 		static struct lock_class_key __key;			\
 									\
 		__percpu_counter_init_many(fbc, value, gfp, nr_counters,\
-					   &__key);			\
+					   &__key, false);		\
 	})
 
 
@@ -130,6 +136,20 @@ static inline bool percpu_counter_initialized(struct percpu_counter *fbc)
 	return (fbc->counters != NULL);
 }
 
+static inline s64 percpu_counter_atomic_read(struct percpu_counter *fbc)
+{
+	return atomic64_read(&fbc->count_atomic);
+}
+
+static inline void percpu_counter_atomic_add(struct percpu_counter *fbc,
+					     s64 amount)
+{
+	atomic64_add(amount, &fbc->count_atomic);
+}
+
+int percpu_counter_switch_to_pcpu_many(struct percpu_counter *fbc,
+				       u32 nr_counters);
+
 #else /* !CONFIG_SMP */
 
 struct percpu_counter {
@@ -260,6 +280,23 @@ static inline bool percpu_counter_initialized(struct percpu_counter *fbc)
 static inline void percpu_counter_sync(struct percpu_counter *fbc)
 {
 }
+
+static inline s64 percpu_counter_atomic_read(struct percpu_counter *fbc)
+{
+	return fbc->count;
+}
+
+static inline void percpu_counter_atomic_add(struct percpu_counter *fbc,
+					     s64 amount)
+{
+	percpu_counter_add(fbc, amount);
+}
+
+static inline int percpu_counter_switch_to_pcpu_many(struct percpu_counter *fbc,
+						     u32 nr_counters)
+{
+	return 0;
+}
 #endif /* CONFIG_SMP */
 
 static inline void percpu_counter_inc(struct percpu_counter *fbc)
diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c
index 44dd133594d4..95c4e038051a 100644
--- a/lib/percpu_counter.c
+++ b/lib/percpu_counter.c
@@ -153,7 +153,7 @@ EXPORT_SYMBOL(__percpu_counter_sum);
 
 int __percpu_counter_init_many(struct percpu_counter *fbc, s64 amount,
 			       gfp_t gfp, u32 nr_counters,
-			       struct lock_class_key *key)
+			       struct lock_class_key *key, bool switch_mode)
 {
 	unsigned long flags __maybe_unused;
 	size_t counter_size;
@@ -174,7 +174,8 @@ int __percpu_counter_init_many(struct percpu_counter *fbc, s64 amount,
 #ifdef CONFIG_HOTPLUG_CPU
 		INIT_LIST_HEAD(&fbc[i].list);
 #endif
-		fbc[i].count = amount;
+		if (likely(!switch_mode))
+			fbc[i].count = amount;
 		fbc[i].counters = (void *)counters + (i * counter_size);
 
 		debug_percpu_counter_activate(&fbc[i]);
@@ -357,6 +358,32 @@ bool __percpu_counter_limited_add(struct percpu_counter *fbc,
 	return good;
 }
 
+/*
+ * percpu_counter_switch_to_pcpu_many: Converts struct percpu_counters from
+ * atomic mode to percpu mode.
+ */
+int percpu_counter_switch_to_pcpu_many(struct percpu_counter *fbc,
+				       u32 nr_counters)
+{
+	static struct lock_class_key __key;
+	unsigned long flags;
+	bool ret = 0;
+
+	if (percpu_counter_initialized(fbc))
+		return 0;
+
+	preempt_disable();
+	local_irq_save(flags);
+	if (likely(!percpu_counter_initialized(fbc)))
+		ret = __percpu_counter_init_many(fbc, 0,
+				GFP_ATOMIC|__GFP_NOWARN|__GFP_ZERO,
+				nr_counters, &__key, true);
+	local_irq_restore(flags);
+	preempt_enable();
+
+	return ret;
+}
+
 static int __init percpu_counter_startup(void)
 {
 	int ret;
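The mm-side consumers of these primitives live in the later patches of the series and are not reproduced here. Very roughly, the intended usage looks like the sketch below; all wrapper names are illustrative guesses, and only the percpu_counter_* calls come from this patch.

/* Illustrative only: how an rss-stat style user could drive the two modes. */
static inline void hypothetical_add_mm_counter(struct mm_struct *mm,
					       int member, long value)
{
	struct percpu_counter *fbc = &mm->rss_stat[member];

	if (percpu_counter_initialized(fbc))
		percpu_counter_add(fbc, value);		/* percpu mode */
	else
		percpu_counter_atomic_add(fbc, value);	/* atomic mode */
}

/*
 * Illustrative switch point: upgrade to percpu mode when the mm gains a
 * second user (e.g. CLONE_VM in copy_process()), while updates are still
 * effectively single-threaded so the unsynchronised mode switch is safe.
 */
static inline int hypothetical_mm_counter_switch(struct mm_struct *mm)
{
	return percpu_counter_switch_to_pcpu_many(mm->rss_stat, NR_MM_COUNTERS);
}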