diff mbox

[11/51] memcg: implement mem_cgroup_css_from_page()

Message ID 1432329245-5844-12-git-send-email-tj@kernel.org (mailing list archive)
State New, archived
Headers show

Commit Message

Tejun Heo May 22, 2015, 9:13 p.m. UTC
Implement mem_cgroup_css_from_page() which returns the
cgroup_subsys_state of the memcg associated with a given page.  This
will be used by cgroup writeback support.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
---
 include/linux/memcontrol.h |  1 +
 mm/memcontrol.c            | 14 ++++++++++++++
 2 files changed, 15 insertions(+)

Comments

Johannes Weiner May 22, 2015, 11:28 p.m. UTC | #1
On Fri, May 22, 2015 at 05:13:25PM -0400, Tejun Heo wrote:
> +/**
> + * mem_cgroup_css_from_page - css of the memcg associated with a page
> + * @page: page of interest
> + *
> + * This function is guaranteed to return a valid cgroup_subsys_state and
> + * the returned css remains accessible until @page is released.
> + */
> +struct cgroup_subsys_state *mem_cgroup_css_from_page(struct page *page)
> +{
> +	if (page->mem_cgroup)
> +		return &page->mem_cgroup->css;
> +	return &root_mem_cgroup->css;
> +}

replace_page_cache() can clear page->mem_cgroup even when the page
still has references, so unfortunately you must hold the page lock
when calling this function.

I haven't checked how you use this - chances are you always have the
page locked anyways - but it probably needs a comment.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo May 24, 2015, 9:24 p.m. UTC | #2
Hello,

On Fri, May 22, 2015 at 07:28:31PM -0400, Johannes Weiner wrote:
> replace_page_cache() can clear page->mem_cgroup even when the page
> still has references, so unfortunately you must hold the page lock
> when calling this function.
> 
> I haven't checked how you use this - chances are you always have the
> page locked anyways - but it probably needs a comment.

Hmmm... as replace_page_cache_page() is used only by fuse and fuse's
bdi doesn't go through the usual writeback accounting which is
necessary for cgroup writeback support anyway, so I don't think this
presents an actual problem.  I'll add a warning in
replace_page_cache_page() which triggers when it's used on a bdi which
has cgroup writeback enabled and add comments explaining what's going
on.

Thanks.
Johannes Weiner May 27, 2015, 12:58 p.m. UTC | #3
On Sun, May 24, 2015 at 05:24:40PM -0400, Tejun Heo wrote:
> Hello,
> 
> On Fri, May 22, 2015 at 07:28:31PM -0400, Johannes Weiner wrote:
> > replace_page_cache() can clear page->mem_cgroup even when the page
> > still has references, so unfortunately you must hold the page lock
> > when calling this function.
> > 
> > I haven't checked how you use this - chances are you always have the
> > page locked anyways - but it probably needs a comment.
> 
> Hmmm... as replace_page_cache_page() is used only by fuse and fuse's
> bdi doesn't go through the usual writeback accounting which is
> necessary for cgroup writeback support anyway, so I don't think this
> presents an actual problem.  I'll add a warning in
> replace_page_cache_page() which triggers when it's used on a bdi which
> has cgroup writeback enabled and add comments explaining what's going
> on.

Okay, so that's no problem then as long as it's documented.

In the long term, it would probably still be a good idea to restore
the invariant that page->mem_cgroup never changes on live pages.  For
the old interface that ship has sailed as live pages can move around
different cgroups; in unified hierarchy, however, we currently only
move charges when migrating pages between page frames.  That can be
switched to duplicating the charge instead and leaving the old page
alone until the final put - which is expected to occur soon after.

Accounting the same page twice for a short period during migration
should be an acceptable trade-off when considering how much simpler it
makes the synchronization rules.  We only have to make sure to clearly
mark interfaces and functions that are only safe for use with unified
hierarchy code.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 294498f..637ef62 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -115,6 +115,7 @@  static inline bool mm_match_cgroup(struct mm_struct *mm,
 }
 
 extern struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *memcg);
+extern struct cgroup_subsys_state *mem_cgroup_css_from_page(struct page *page);
 
 struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *,
 				   struct mem_cgroup *,
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b22a92b..763f8f3 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -598,6 +598,20 @@  struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *memcg)
 	return &memcg->css;
 }
 
+/**
+ * mem_cgroup_css_from_page - css of the memcg associated with a page
+ * @page: page of interest
+ *
+ * This function is guaranteed to return a valid cgroup_subsys_state and
+ * the returned css remains accessible until @page is released.
+ */
+struct cgroup_subsys_state *mem_cgroup_css_from_page(struct page *page)
+{
+	if (page->mem_cgroup)
+		return &page->mem_cgroup->css;
+	return &root_mem_cgroup->css;
+}
+
 static struct mem_cgroup_per_zone *
 mem_cgroup_page_zoneinfo(struct mem_cgroup *memcg, struct page *page)
 {