Message ID | 20240722070810.46016-1-songmuchun@bytedance.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm: kmem: add lockdep assertion to obj_cgroup_memcg | expand |
On 2024/7/22 15:08, Muchun Song wrote: > The obj_cgroup_memcg() is supposed to safe to prevent the returned > memory cgroup from being freed only when the caller is holding the > rcu read lock or objcg_lock or cgroup_mutex. It is very easy to > ignore thoes conditions when users call some upper APIs which call > obj_cgroup_memcg() internally like mem_cgroup_from_slab_obj() (See > the link below). So it is better to add lockdep assertion to > obj_cgroup_memcg() to find those issues ASAP. Yeah, some users care about the lifetime of returned memcg, while some other users maybe not. Maybe a dumb question, can we just make objcg hold the refcount of its pointed memcg? So the users of that objcg don't need to care about the refcount of memcg? (We could switch the refcount from old memcg to the new memcg when objcg switch memcg pointer, right?) Thanks. > > Because there is no user of obj_cgroup_memcg() holding objcg_lock > to make the returned memory cgroup safe, do not add objcg_lock > assertion (We should export objcg_lock if we really want to do) > and leave a comment to indicate it is intentional. > > Some users like __mem_cgroup_uncharge() do not care the lifetime > of the returned memory cgroup, which just want to know if the > folio is charged to a memory cgroup, therefore, they do not need > to hold the needed locks. In which case, introduce a new helper > folio_memcg_charged() to do this. Compare it to folio_memcg(), it > could eliminate a memory access of objcg->memcg for kmem, actually, > a really small gain. > > Link: https://lore.kernel.org/all/20240718083607.42068-1-songmuchun@bytedance.com/ > Signed-off-by: Muchun Song <songmuchun@bytedance.com> > --- > include/linux/memcontrol.h | 22 +++++++++++++++++++--- > mm/memcontrol.c | 6 +++--- > 2 files changed, 22 insertions(+), 6 deletions(-) > > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h > index fc94879db4dff..d616c50025098 100644 > --- a/include/linux/memcontrol.h > +++ b/include/linux/memcontrol.h > @@ -360,11 +360,13 @@ static inline bool folio_memcg_kmem(struct folio *folio); > * After the initialization objcg->memcg is always pointing at > * a valid memcg, but can be atomically swapped to the parent memcg. > * > - * The caller must ensure that the returned memcg won't be released: > - * e.g. acquire the rcu_read_lock or css_set_lock. > + * The caller must ensure that the returned memcg won't be released. > */ > static inline struct mem_cgroup *obj_cgroup_memcg(struct obj_cgroup *objcg) > { > + WARN_ON_ONCE(!rcu_read_lock_held() && > + /* !lockdep_is_held(&objcg_lock) && */ > + !lockdep_is_held(&cgroup_mutex)); > return READ_ONCE(objcg->memcg); > } > > @@ -438,6 +440,19 @@ static inline struct mem_cgroup *folio_memcg(struct folio *folio) > return __folio_memcg(folio); > } > > +/* > + * folio_memcg_charged - If a folio is charged to a memory cgroup. > + * @folio: Pointer to the folio. > + * > + * Returns true if folio is charged to a memory cgroup, otherwise returns false. > + */ > +static inline bool folio_memcg_charged(struct folio *folio) > +{ > + if (folio_memcg_kmem(folio)) > + return __folio_objcg(folio) != NULL; > + return __folio_memcg(folio) != NULL; > +} > + > /** > * folio_memcg_rcu - Locklessly get the memory cgroup associated with a folio. > * @folio: Pointer to the folio. > @@ -454,7 +469,6 @@ static inline struct mem_cgroup *folio_memcg_rcu(struct folio *folio) > unsigned long memcg_data = READ_ONCE(folio->memcg_data); > > VM_BUG_ON_FOLIO(folio_test_slab(folio), folio); > - WARN_ON_ONCE(!rcu_read_lock_held()); > > if (memcg_data & MEMCG_DATA_KMEM) { > struct obj_cgroup *objcg; > @@ -463,6 +477,8 @@ static inline struct mem_cgroup *folio_memcg_rcu(struct folio *folio) > return obj_cgroup_memcg(objcg); > } > > + WARN_ON_ONCE(!rcu_read_lock_held()); > + > return (struct mem_cgroup *)(memcg_data & ~OBJEXTS_FLAGS_MASK); > } > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 622d4544edd24..3da0284573857 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -2366,7 +2366,7 @@ void mem_cgroup_cancel_charge(struct mem_cgroup *memcg, unsigned int nr_pages) > > static void commit_charge(struct folio *folio, struct mem_cgroup *memcg) > { > - VM_BUG_ON_FOLIO(folio_memcg(folio), folio); > + VM_BUG_ON_FOLIO(folio_memcg_charged(folio), folio); > /* > * Any of the following ensures page's memcg stability: > * > @@ -4617,7 +4617,7 @@ void __mem_cgroup_uncharge(struct folio *folio) > struct uncharge_gather ug; > > /* Don't touch folio->lru of any random page, pre-check: */ > - if (!folio_memcg(folio)) > + if (!folio_memcg_charged(folio)) > return; > > uncharge_gather_clear(&ug); > @@ -4662,7 +4662,7 @@ void mem_cgroup_replace_folio(struct folio *old, struct folio *new) > return; > > /* Page cache replacement: new folio already charged? */ > - if (folio_memcg(new)) > + if (folio_memcg_charged(new)) > return; > > memcg = folio_memcg(old);
> On Jul 22, 2024, at 15:46, Chengming Zhou <chengming.zhou@linux.dev> wrote: > > On 2024/7/22 15:08, Muchun Song wrote: >> The obj_cgroup_memcg() is supposed to safe to prevent the returned >> memory cgroup from being freed only when the caller is holding the >> rcu read lock or objcg_lock or cgroup_mutex. It is very easy to >> ignore thoes conditions when users call some upper APIs which call >> obj_cgroup_memcg() internally like mem_cgroup_from_slab_obj() (See >> the link below). So it is better to add lockdep assertion to >> obj_cgroup_memcg() to find those issues ASAP. > > Yeah, some users care about the lifetime of returned memcg, while > some other users maybe not. > > Maybe a dumb question, can we just make objcg hold the refcount of > its pointed memcg? So the users of that objcg don't need to care > about the refcount of memcg? (We could switch the refcount from > old memcg to the new memcg when objcg switch memcg pointer, right?) You mean the memcg is pinned if objcg is pinned, right? If yes, in which case, reparenting of memcg cannot make memcg being freed ASAP. > > Thanks. > >> Because there is no user of obj_cgroup_memcg() holding objcg_lock >> to make the returned memory cgroup safe, do not add objcg_lock >> assertion (We should export objcg_lock if we really want to do) >> and leave a comment to indicate it is intentional. >> Some users like __mem_cgroup_uncharge() do not care the lifetime >> of the returned memory cgroup, which just want to know if the >> folio is charged to a memory cgroup, therefore, they do not need >> to hold the needed locks. In which case, introduce a new helper >> folio_memcg_charged() to do this. Compare it to folio_memcg(), it >> could eliminate a memory access of objcg->memcg for kmem, actually, >> a really small gain. >> Link: https://lore.kernel.org/all/20240718083607.42068-1-songmuchun@bytedance.com/ >> Signed-off-by: Muchun Song <songmuchun@bytedance.com> >> --- >> include/linux/memcontrol.h | 22 +++++++++++++++++++--- >> mm/memcontrol.c | 6 +++--- >> 2 files changed, 22 insertions(+), 6 deletions(-) >> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h >> index fc94879db4dff..d616c50025098 100644 >> --- a/include/linux/memcontrol.h >> +++ b/include/linux/memcontrol.h >> @@ -360,11 +360,13 @@ static inline bool folio_memcg_kmem(struct folio *folio); >> * After the initialization objcg->memcg is always pointing at >> * a valid memcg, but can be atomically swapped to the parent memcg. >> * >> - * The caller must ensure that the returned memcg won't be released: >> - * e.g. acquire the rcu_read_lock or css_set_lock. >> + * The caller must ensure that the returned memcg won't be released. >> */ >> static inline struct mem_cgroup *obj_cgroup_memcg(struct obj_cgroup *objcg) >> { >> + WARN_ON_ONCE(!rcu_read_lock_held() && >> + /* !lockdep_is_held(&objcg_lock) && */ >> + !lockdep_is_held(&cgroup_mutex)); >> return READ_ONCE(objcg->memcg); >> } >> @@ -438,6 +440,19 @@ static inline struct mem_cgroup *folio_memcg(struct folio *folio) >> return __folio_memcg(folio); >> } >> +/* >> + * folio_memcg_charged - If a folio is charged to a memory cgroup. >> + * @folio: Pointer to the folio. >> + * >> + * Returns true if folio is charged to a memory cgroup, otherwise returns false. >> + */ >> +static inline bool folio_memcg_charged(struct folio *folio) >> +{ >> + if (folio_memcg_kmem(folio)) >> + return __folio_objcg(folio) != NULL; >> + return __folio_memcg(folio) != NULL; >> +} >> + >> /** >> * folio_memcg_rcu - Locklessly get the memory cgroup associated with a folio. >> * @folio: Pointer to the folio. >> @@ -454,7 +469,6 @@ static inline struct mem_cgroup *folio_memcg_rcu(struct folio *folio) >> unsigned long memcg_data = READ_ONCE(folio->memcg_data); >> VM_BUG_ON_FOLIO(folio_test_slab(folio), folio); >> - WARN_ON_ONCE(!rcu_read_lock_held()); >> if (memcg_data & MEMCG_DATA_KMEM) { >> struct obj_cgroup *objcg; >> @@ -463,6 +477,8 @@ static inline struct mem_cgroup *folio_memcg_rcu(struct folio *folio) >> return obj_cgroup_memcg(objcg); >> } >> + WARN_ON_ONCE(!rcu_read_lock_held()); >> + >> return (struct mem_cgroup *)(memcg_data & ~OBJEXTS_FLAGS_MASK); >> } >> diff --git a/mm/memcontrol.c b/mm/memcontrol.c >> index 622d4544edd24..3da0284573857 100644 >> --- a/mm/memcontrol.c >> +++ b/mm/memcontrol.c >> @@ -2366,7 +2366,7 @@ void mem_cgroup_cancel_charge(struct mem_cgroup *memcg, unsigned int nr_pages) >> static void commit_charge(struct folio *folio, struct mem_cgroup *memcg) >> { >> - VM_BUG_ON_FOLIO(folio_memcg(folio), folio); >> + VM_BUG_ON_FOLIO(folio_memcg_charged(folio), folio); >> /* >> * Any of the following ensures page's memcg stability: >> * >> @@ -4617,7 +4617,7 @@ void __mem_cgroup_uncharge(struct folio *folio) >> struct uncharge_gather ug; >> /* Don't touch folio->lru of any random page, pre-check: */ >> - if (!folio_memcg(folio)) >> + if (!folio_memcg_charged(folio)) >> return; >> uncharge_gather_clear(&ug); >> @@ -4662,7 +4662,7 @@ void mem_cgroup_replace_folio(struct folio *old, struct folio *new) >> return; >> /* Page cache replacement: new folio already charged? */ >> - if (folio_memcg(new)) >> + if (folio_memcg_charged(new)) >> return; >> memcg = folio_memcg(old);
On Mon, Jul 22, 2024 at 03:08:10PM GMT, Muchun Song wrote: > The obj_cgroup_memcg() is supposed to safe to prevent the returned > memory cgroup from being freed only when the caller is holding the > rcu read lock or objcg_lock or cgroup_mutex. It is very easy to > ignore thoes conditions when users call some upper APIs which call > obj_cgroup_memcg() internally like mem_cgroup_from_slab_obj() (See > the link below). So it is better to add lockdep assertion to > obj_cgroup_memcg() to find those issues ASAP. > > Because there is no user of obj_cgroup_memcg() holding objcg_lock > to make the returned memory cgroup safe, do not add objcg_lock > assertion (We should export objcg_lock if we really want to do) > and leave a comment to indicate it is intentional. > Do we expect non-memcg code to access objcg_lock? To me this is some internal implementation detail of memcg and should not be accessible outside memcg code. So, I would recommend to not mention objcg_lock at all. > Some users like __mem_cgroup_uncharge() do not care the lifetime > of the returned memory cgroup, which just want to know if the > folio is charged to a memory cgroup, therefore, they do not need > to hold the needed locks. In which case, introduce a new helper > folio_memcg_charged() to do this. Compare it to folio_memcg(), it > could eliminate a memory access of objcg->memcg for kmem, actually, > a really small gain. > > Link: https://lore.kernel.org/all/20240718083607.42068-1-songmuchun@bytedance.com/ > Signed-off-by: Muchun Song <songmuchun@bytedance.com> > --- > include/linux/memcontrol.h | 22 +++++++++++++++++++--- > mm/memcontrol.c | 6 +++--- > 2 files changed, 22 insertions(+), 6 deletions(-) > > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h > index fc94879db4dff..d616c50025098 100644 > --- a/include/linux/memcontrol.h > +++ b/include/linux/memcontrol.h > @@ -360,11 +360,13 @@ static inline bool folio_memcg_kmem(struct folio *folio); > * After the initialization objcg->memcg is always pointing at > * a valid memcg, but can be atomically swapped to the parent memcg. > * > - * The caller must ensure that the returned memcg won't be released: > - * e.g. acquire the rcu_read_lock or css_set_lock. > + * The caller must ensure that the returned memcg won't be released. > */ > static inline struct mem_cgroup *obj_cgroup_memcg(struct obj_cgroup *objcg) > { > + WARN_ON_ONCE(!rcu_read_lock_held() && > + /* !lockdep_is_held(&objcg_lock) && */ > + !lockdep_is_held(&cgroup_mutex)); > return READ_ONCE(objcg->memcg); > } > > @@ -438,6 +440,19 @@ static inline struct mem_cgroup *folio_memcg(struct folio *folio) > return __folio_memcg(folio); > } > > +/* > + * folio_memcg_charged - If a folio is charged to a memory cgroup. > + * @folio: Pointer to the folio. > + * > + * Returns true if folio is charged to a memory cgroup, otherwise returns false. > + */ > +static inline bool folio_memcg_charged(struct folio *folio) > +{ > + if (folio_memcg_kmem(folio)) > + return __folio_objcg(folio) != NULL; > + return __folio_memcg(folio) != NULL; > +} > + > /** > * folio_memcg_rcu - Locklessly get the memory cgroup associated with a folio. > * @folio: Pointer to the folio. > @@ -454,7 +469,6 @@ static inline struct mem_cgroup *folio_memcg_rcu(struct folio *folio) > unsigned long memcg_data = READ_ONCE(folio->memcg_data); > > VM_BUG_ON_FOLIO(folio_test_slab(folio), folio); > - WARN_ON_ONCE(!rcu_read_lock_held()); > > if (memcg_data & MEMCG_DATA_KMEM) { > struct obj_cgroup *objcg; > @@ -463,6 +477,8 @@ static inline struct mem_cgroup *folio_memcg_rcu(struct folio *folio) > return obj_cgroup_memcg(objcg); > } > > + WARN_ON_ONCE(!rcu_read_lock_held()); > + > return (struct mem_cgroup *)(memcg_data & ~OBJEXTS_FLAGS_MASK); > } > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 622d4544edd24..3da0284573857 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -2366,7 +2366,7 @@ void mem_cgroup_cancel_charge(struct mem_cgroup *memcg, unsigned int nr_pages) > > static void commit_charge(struct folio *folio, struct mem_cgroup *memcg) > { > - VM_BUG_ON_FOLIO(folio_memcg(folio), folio); > + VM_BUG_ON_FOLIO(folio_memcg_charged(folio), folio); > /* > * Any of the following ensures page's memcg stability: > * > @@ -4617,7 +4617,7 @@ void __mem_cgroup_uncharge(struct folio *folio) > struct uncharge_gather ug; > > /* Don't touch folio->lru of any random page, pre-check: */ > - if (!folio_memcg(folio)) > + if (!folio_memcg_charged(folio)) > return; > > uncharge_gather_clear(&ug); > @@ -4662,7 +4662,7 @@ void mem_cgroup_replace_folio(struct folio *old, struct folio *new) > return; > > /* Page cache replacement: new folio already charged? */ > - if (folio_memcg(new)) > + if (folio_memcg_charged(new)) > return; > > memcg = folio_memcg(old); > -- > 2.20.1 >
> On Jul 24, 2024, at 02:39, Shakeel Butt <shakeel.butt@linux.dev> wrote: > > On Mon, Jul 22, 2024 at 03:08:10PM GMT, Muchun Song wrote: >> The obj_cgroup_memcg() is supposed to safe to prevent the returned >> memory cgroup from being freed only when the caller is holding the >> rcu read lock or objcg_lock or cgroup_mutex. It is very easy to >> ignore thoes conditions when users call some upper APIs which call >> obj_cgroup_memcg() internally like mem_cgroup_from_slab_obj() (See >> the link below). So it is better to add lockdep assertion to >> obj_cgroup_memcg() to find those issues ASAP. >> >> Because there is no user of obj_cgroup_memcg() holding objcg_lock >> to make the returned memory cgroup safe, do not add objcg_lock >> assertion (We should export objcg_lock if we really want to do) >> and leave a comment to indicate it is intentional. >> > > Do we expect non-memcg code to access objcg_lock? To me this is some > internal implementation detail of memcg and should not be accessible > outside memcg code. So, I would recommend to not mention objcg_lock at > all. Also make sense. Will update next version.
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index fc94879db4dff..d616c50025098 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -360,11 +360,13 @@ static inline bool folio_memcg_kmem(struct folio *folio); * After the initialization objcg->memcg is always pointing at * a valid memcg, but can be atomically swapped to the parent memcg. * - * The caller must ensure that the returned memcg won't be released: - * e.g. acquire the rcu_read_lock or css_set_lock. + * The caller must ensure that the returned memcg won't be released. */ static inline struct mem_cgroup *obj_cgroup_memcg(struct obj_cgroup *objcg) { + WARN_ON_ONCE(!rcu_read_lock_held() && + /* !lockdep_is_held(&objcg_lock) && */ + !lockdep_is_held(&cgroup_mutex)); return READ_ONCE(objcg->memcg); } @@ -438,6 +440,19 @@ static inline struct mem_cgroup *folio_memcg(struct folio *folio) return __folio_memcg(folio); } +/* + * folio_memcg_charged - If a folio is charged to a memory cgroup. + * @folio: Pointer to the folio. + * + * Returns true if folio is charged to a memory cgroup, otherwise returns false. + */ +static inline bool folio_memcg_charged(struct folio *folio) +{ + if (folio_memcg_kmem(folio)) + return __folio_objcg(folio) != NULL; + return __folio_memcg(folio) != NULL; +} + /** * folio_memcg_rcu - Locklessly get the memory cgroup associated with a folio. * @folio: Pointer to the folio. @@ -454,7 +469,6 @@ static inline struct mem_cgroup *folio_memcg_rcu(struct folio *folio) unsigned long memcg_data = READ_ONCE(folio->memcg_data); VM_BUG_ON_FOLIO(folio_test_slab(folio), folio); - WARN_ON_ONCE(!rcu_read_lock_held()); if (memcg_data & MEMCG_DATA_KMEM) { struct obj_cgroup *objcg; @@ -463,6 +477,8 @@ static inline struct mem_cgroup *folio_memcg_rcu(struct folio *folio) return obj_cgroup_memcg(objcg); } + WARN_ON_ONCE(!rcu_read_lock_held()); + return (struct mem_cgroup *)(memcg_data & ~OBJEXTS_FLAGS_MASK); } diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 622d4544edd24..3da0284573857 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2366,7 +2366,7 @@ void mem_cgroup_cancel_charge(struct mem_cgroup *memcg, unsigned int nr_pages) static void commit_charge(struct folio *folio, struct mem_cgroup *memcg) { - VM_BUG_ON_FOLIO(folio_memcg(folio), folio); + VM_BUG_ON_FOLIO(folio_memcg_charged(folio), folio); /* * Any of the following ensures page's memcg stability: * @@ -4617,7 +4617,7 @@ void __mem_cgroup_uncharge(struct folio *folio) struct uncharge_gather ug; /* Don't touch folio->lru of any random page, pre-check: */ - if (!folio_memcg(folio)) + if (!folio_memcg_charged(folio)) return; uncharge_gather_clear(&ug); @@ -4662,7 +4662,7 @@ void mem_cgroup_replace_folio(struct folio *old, struct folio *new) return; /* Page cache replacement: new folio already charged? */ - if (folio_memcg(new)) + if (folio_memcg_charged(new)) return; memcg = folio_memcg(old);
The obj_cgroup_memcg() is supposed to safe to prevent the returned memory cgroup from being freed only when the caller is holding the rcu read lock or objcg_lock or cgroup_mutex. It is very easy to ignore thoes conditions when users call some upper APIs which call obj_cgroup_memcg() internally like mem_cgroup_from_slab_obj() (See the link below). So it is better to add lockdep assertion to obj_cgroup_memcg() to find those issues ASAP. Because there is no user of obj_cgroup_memcg() holding objcg_lock to make the returned memory cgroup safe, do not add objcg_lock assertion (We should export objcg_lock if we really want to do) and leave a comment to indicate it is intentional. Some users like __mem_cgroup_uncharge() do not care the lifetime of the returned memory cgroup, which just want to know if the folio is charged to a memory cgroup, therefore, they do not need to hold the needed locks. In which case, introduce a new helper folio_memcg_charged() to do this. Compare it to folio_memcg(), it could eliminate a memory access of objcg->memcg for kmem, actually, a really small gain. Link: https://lore.kernel.org/all/20240718083607.42068-1-songmuchun@bytedance.com/ Signed-off-by: Muchun Song <songmuchun@bytedance.com> --- include/linux/memcontrol.h | 22 +++++++++++++++++++--- mm/memcontrol.c | 6 +++--- 2 files changed, 22 insertions(+), 6 deletions(-)