[2/2] mm: don't skip memory guarantee calculations

Message ID 20180522132528.23769-2-guro@fb.com (mailing list archive)
State New, archived

Commit Message

Roman Gushchin May 22, 2018, 1:25 p.m. UTC
There are two cases when effective memory guarantee calculation
is mistakenly skipped:

1) If memcg is a child of the root cgroup, and the root
cgroup is not root_mem_cgroup (in other words, if the reclaim
is targeted). Top-level memory cgroups are handled specially
in mem_cgroup_protected(), because the root memory cgroup doesn't
have a memory guarantee and can't limit its children's guarantees.
So the whole effective guarantee calculation is skipped.
But in the case of targeted reclaim things are different:
cgroups whose parent exceeded its memory limit aren't special.

2) If memcg has no charged memory (memory usage is 0). In this
case mem_cgroup_protected() always returns MEMCG_PROT_NONE, which
is correct and prevents generating fake memory low events for
empty cgroups. But skipping the emin/elow calculation is wrong:
if there is no global memory pressure, there might not be another
chance to recalculate them, so we can end up with effective guarantees
set to 0 without any reason.

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 mm/memcontrol.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)
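
For readers following the control flow, here is a minimal userspace sketch
of the ordering this patch aims for: compute and store the effective values
first, and only then apply the early returns. It is not the kernel code.
The toy_* names, the struct layout and the hierarchy_root parameter (standing
in for root_mem_cgroup) are assumptions made for illustration, and the
proportional distribution of a parent's guarantee among siblings is
deliberately reduced to a simple clamp against the parent's effective values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for MEMCG_PROT_NONE / _LOW / _MIN. */
enum toy_prot { TOY_PROT_NONE, TOY_PROT_LOW, TOY_PROT_MIN };

/* Hypothetical, heavily reduced model of a memory cgroup. */
struct toy_memcg {
	struct toy_memcg *parent;	/* NULL only for the hierarchy root */
	unsigned long usage;		/* charged memory */
	unsigned long min, low;		/* configured guarantees */
	unsigned long emin, elow;	/* cached effective guarantees */
};

static unsigned long toy_min(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * hierarchy_root models root_mem_cgroup; reclaim_root models the "root"
 * argument (NULL for global reclaim).
 */
static enum toy_prot toy_protected(struct toy_memcg *hierarchy_root,
				   struct toy_memcg *reclaim_root,
				   struct toy_memcg *memcg)
{
	unsigned long emin, elow;

	assert(hierarchy_root != NULL && memcg != NULL);

	/* The hierarchy root has no guarantee at all. */
	if (memcg == hierarchy_root)
		return TOY_PROT_NONE;

	assert(memcg->parent != NULL);

	emin = memcg->min;
	elow = memcg->low;

	/*
	 * Only children of the hierarchy root skip the parent clamp.
	 * Children of a targeted reclaim root are clamped like any other.
	 * Simplification: the real code also weighs sibling usage.
	 */
	if (memcg->parent != hierarchy_root) {
		emin = toy_min(emin, memcg->parent->emin);
		elow = toy_min(elow, memcg->parent->elow);
	}

	/* Always store the effective values before any early return. */
	memcg->emin = emin;
	memcg->elow = elow;

	/* The reclaim root itself is never protected from its own reclaim. */
	if (reclaim_root != NULL && memcg == reclaim_root)
		return TOY_PROT_NONE;

	/* Empty cgroups are not protected, but emin/elow are now up to date. */
	if (memcg->usage == 0)
		return TOY_PROT_NONE;

	if (memcg->usage <= emin)
		return TOY_PROT_MIN;
	if (memcg->usage <= elow)
		return TOY_PROT_LOW;
	return TOY_PROT_NONE;
}
```

In this model, an empty top-level cgroup A (min 10, low 20) still gets
TOY_PROT_NONE, but its emin/elow are stored as 10/20, so a child B reclaimed
with A as the reclaim root is clamped against real values rather than 0.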

Comments

Michal Hocko June 4, 2018, 12:29 p.m. UTC | #1
On Tue 22-05-18 14:25:28, Roman Gushchin wrote:
> There are two cases when effective memory guarantee calculation
> is mistakenly skipped:
> 
> 1) If memcg is a child of the root cgroup, and the root
> cgroup is not root_mem_cgroup (in other words, if the reclaim
> is targeted). Top-level memory cgroups are handled specially
> in mem_cgroup_protected(), because the root memory cgroup doesn't
> have a memory guarantee and can't limit its children's guarantees.
> So the whole effective guarantee calculation is skipped.
> But in the case of targeted reclaim things are different:
> cgroups whose parent exceeded its memory limit aren't special.
> 
> 2) If memcg has no charged memory (memory usage is 0). In this
> case mem_cgroup_protected() always returns MEMCG_PROT_NONE, which
> is correct and prevents generating fake memory low events for
> empty cgroups. But skipping the emin/elow calculation is wrong:
> if there is no global memory pressure, there might not be another
> chance to recalculate them, so we can end up with effective guarantees
> set to 0 without any reason.

Roman, so these two patches are on top of the min limit patches, right?
The fact that they come after just makes me feel this whole thing is not
completely thought through and I would like to see all 4 patches in one
series describing the whole design. We are getting really close to the
merge window and last-minute updates make me really nervous. Can you
please repost the whole thing after the merge window?

As I've said earlier, I am not even sure we really want to have a hard
guarantee now that we have decided to go with the low limit. So very good
reasoning should be added for the whole thing.

Thanks!
Roman Gushchin June 4, 2018, 4:23 p.m. UTC | #2
On Mon, Jun 04, 2018 at 02:29:53PM +0200, Michal Hocko wrote:
> On Tue 22-05-18 14:25:28, Roman Gushchin wrote:
> > There are two cases when effective memory guarantee calculation
> > is mistakenly skipped:
> > 
> > 1) If memcg is a child of the root cgroup, and the root
> > cgroup is not root_mem_cgroup (in other words, if the reclaim
> > is targeted). Top-level memory cgroups are handled specially
> > in mem_cgroup_protected(), because the root memory cgroup doesn't
> > have a memory guarantee and can't limit its children's guarantees.
> > So the whole effective guarantee calculation is skipped.
> > But in the case of targeted reclaim things are different:
> > cgroups whose parent exceeded its memory limit aren't special.
> > 
> > 2) If memcg has no charged memory (memory usage is 0). In this
> > case mem_cgroup_protected() always returns MEMCG_PROT_NONE, which
> > is correct and prevents generating fake memory low events for
> > empty cgroups. But skipping the emin/elow calculation is wrong:
> > if there is no global memory pressure, there might not be another
> > chance to recalculate them, so we can end up with effective guarantees
> > set to 0 without any reason.
> 
> Roman, so these two patches are on top of the min limit patches, right?
> The fact that they come after just makes me feel this whole thing is not
> completely thought through and I would like to see all 4 patches in one
> series describing the whole design. We are getting really close to the
> merge window and last-minute updates make me really nervous. Can you
> please repost the whole thing after the merge window?

Hi, Michal!

These changes fix some edge cases which I discovered
when I started writing unit tests for the memory controller
(see tools/testing/selftests/cgroup/). All these edge cases
are temporary effects which exist only when there is no
global memory pressure.

We've been using my implementation in production for some time,
and so far have had no issues with it.

Please note that the existing implementation of memory.low has much more
serious problems: it barely works without some significant configuration
tweaks (e.g. setting every memory.low in the hierarchy to max, except for
the leaves), which are painful in production.

I'm happy to discuss any concrete issues/concerns, but I really see
no reasons to drop it from the mm tree now and start the discussion
from scratch.

Thank you!
Michal Hocko June 5, 2018, 9:03 a.m. UTC | #3
On Mon 04-06-18 17:23:06, Roman Gushchin wrote:
[...]
> I'm happy to discuss any concrete issues/concerns, but I really see
> no reasons to drop it from the mm tree now and start the discussion
> from scratch.

I do not think this is ready for the current merge window. Sorry! I
would really prefer to see the whole thing in one series to have a
better picture.
Roman Gushchin June 5, 2018, 10:15 a.m. UTC | #4
On Tue, Jun 05, 2018 at 11:03:49AM +0200, Michal Hocko wrote:
> On Mon 04-06-18 17:23:06, Roman Gushchin wrote:
> [...]
> > I'm happy to discuss any concrete issues/concerns, but I really see
> > no reasons to drop it from the mm tree now and start the discussion
> > from scratch.
> 
> I do not think this is ready for the current merge window. Sorry! I
> would really prefer to see the whole thing in one series to have a
> better picture.

Please provide a specific reason for that. I appreciate your opinion,
but *I think* it's not an argument, seriously.

We discussed the patchset back in March and I made several iterations
based on the received feedback. Later we had a separate discussion with Greg,
who proposed an alternative solution, which, unfortunately, had some serious
shortcomings. And, as I remember, some time ago we discussed memory.min
with you.
And now you want to start from scratch without providing any reason.
I find it counter-productive, sorry.

Thanks!
Michal Hocko June 5, 2018, 10:28 a.m. UTC | #5
On Tue 05-06-18 11:15:45, Roman Gushchin wrote:
> On Tue, Jun 05, 2018 at 11:03:49AM +0200, Michal Hocko wrote:
> > On Mon 04-06-18 17:23:06, Roman Gushchin wrote:
> > [...]
> > > I'm happy to discuss any concrete issues/concerns, but I really see
> > > no reasons to drop it from the mm tree now and start the discussion
> > > from scratch.
> > 
> > I do not think this is ready for the current merge window. Sorry! I
> > would really prefer to see the whole thing in one series to have a
> > better picture.
> 
> Please provide a specific reason for that. I appreciate your opinion,
> but *I think* it's not an argument, seriously.

Seeing two follow-up fixes close to the merge window just speaks for
itself. Besides that, there is no need to rush this now.
 
> We discussed the patchset back in March and I made several iterations
> based on the received feedback. Later we had a separate discussion with Greg,
> who proposed an alternative solution, which, unfortunately, had some serious
> shortcomings. And, as I remember, some time ago we discussed memory.min
> with you.
> And now you want to start from scratch without providing any reason.
> I find it counter-productive, sorry.

I am sorry I couldn't give it more time, but this release cycle was even
crazier than usual.

Patch

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b9cd0bb63759..20c4f0a97d4c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5809,20 +5809,15 @@  enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
 	if (mem_cgroup_disabled())
 		return MEMCG_PROT_NONE;
 
-	if (!root)
-		root = root_mem_cgroup;
-	if (memcg == root)
+	if (memcg == root_mem_cgroup)
 		return MEMCG_PROT_NONE;
 
 	usage = page_counter_read(&memcg->memory);
-	if (!usage)
-		return MEMCG_PROT_NONE;
-
 	emin = memcg->memory.min;
 	elow = memcg->memory.low;
 
 	parent = parent_mem_cgroup(memcg);
-	if (parent == root)
+	if (parent == root_mem_cgroup)
 		goto exit;
 
 	parent_emin = READ_ONCE(parent->memory.emin);
@@ -5857,6 +5852,12 @@  enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
 	memcg->memory.emin = emin;
 	memcg->memory.elow = elow;
 
+	if (root && memcg == root)
+		return MEMCG_PROT_NONE;
+
+	if (!usage)
+		return MEMCG_PROT_NONE;
+
 	if (usage <= emin)
 		return MEMCG_PROT_MIN;
 	else if (usage <= elow)