[RFC] memcg: Add swappiness to cgroup2

Message ID 1577252208-32419-1-git-send-email-teawater@gmail.com
State New

Commit Message

Hui Zhu Dec. 25, 2019, 5:36 a.m. UTC
Even if cgroup2 has swap.max, swappiness is still a very useful config.
This commit adds swappiness to cgroup2.

Signed-off-by: Hui Zhu <teawaterz@linux.alibaba.com>
---
 mm/memcontrol.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Chris Down Dec. 25, 2019, 2:05 p.m. UTC | #1
Hi Hui,

Hui Zhu writes:
>Even if cgroup2 has swap.max, swappiness is still a very useful config.
>This commit adds swappiness to cgroup2.

When submitting patches like this, it's important to explain *why* you want it 
and what evidence there is. For example, how should one use this to compose a 
reasonable system? Why aren't existing protection controls sufficient for your 
use case? Where's the data?

Also, why would swappiness be something cgroup-specific instead of 
hardware-specific, when desired swappiness is really largely about the hardware 
you have in your system?

I struggle to think of situations where per-cgroup swappiness would be useful, 
since it's really not a workload-specific setting.

Thanks,

Chris

>Signed-off-by: Hui Zhu <teawaterz@linux.alibaba.com>
>---
> mm/memcontrol.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
>diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>index c5b5f74..e966396 100644
>--- a/mm/memcontrol.c
>+++ b/mm/memcontrol.c
>@@ -7143,6 +7143,11 @@ static struct cftype swap_files[] = {
> 		.file_offset = offsetof(struct mem_cgroup, swap_events_file),
> 		.seq_show = swap_events_show,
> 	},
>+	{
>+		.name = "swappiness",
>+		.read_u64 = mem_cgroup_swappiness_read,
>+		.write_u64 = mem_cgroup_swappiness_write,
>+	},
> 	{ }	/* terminate */
> };
>
>-- 
>2.7.4
>
>
teawater Dec. 26, 2019, 6:56 a.m. UTC | #2
> On Dec 25, 2019, at 22:05, Chris Down <chris@chrisdown.name> wrote:
> 
> Hi Hui,
> 
> Hui Zhu writes:
>> Even if cgroup2 has swap.max, swappiness is still a very useful config.
>> This commit adds swappiness to cgroup2.
> 
> When submitting patches like this, it's important to explain *why* you want it and what evidence there is. For example, how should one use this to compose a reasonable system? Why aren't existing protection controls sufficient for your use case? Where's the data?
> 
> Also, why would swappiness be something cgroup-specific instead of hardware-specific, when desired swappiness is really largely about the hardware you have in your system?
> 
> I struggle to think of situations where per-cgroup swappiness would be useful, since it's really not a workload-specific setting.


Hi Chris,

My thinking about per-cgroup swappiness is that different applications have different memory footprints.
For example, consider an application that does a lot of file access in a memory-constrained environment.  Its performance depends on file access speed, so keeping more file cache is good for it.  A higher swappiness therefore helps it, especially with a high-speed swap device (zram/zswap).
In the same environment, consider an application that accesses anon memory heavily.  A low swappiness is good for its performance.  But preventing it from swapping at all is not good for it either, because its code lives in the file cache, and dropping that file cache will sometimes slow the application down.
Both of these are extreme examples.  Other applications may access both file and anon memory, and a specific swappiness value may suit each of them best.

This is my reasoning for adding swappiness to cgroup2.

Best,
Hui
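
The file-versus-anon trade-off described above can be sketched with a small userspace model. This is only an illustration, loosely based on the anon_prio/file_prio weighting that mm/vmscan.c used around this kernel version (anon weight = swappiness, file weight = 200 - swappiness). The names scan_split and split_scan are hypothetical, and the sketch deliberately ignores the recently-scanned/recently-rotated feedback the real reclaim code applies.

```c
#include <assert.h>

/* Hypothetical result type: how many pages to scan on each LRU type. */
struct scan_split {
	unsigned long anon;
	unsigned long file;
};

/*
 * Split a scan budget between anon and file pages by swappiness.
 * Assumption: swappiness is in 0..100, anon weight is swappiness and
 * file weight is 200 - swappiness, so the two weights always sum to 200.
 */
static struct scan_split split_scan(unsigned long nr_to_scan,
				    unsigned int swappiness)
{
	assert(swappiness <= 100);

	unsigned long anon_prio = swappiness;
	unsigned long file_prio = 200 - anon_prio;
	struct scan_split s;

	s.anon = nr_to_scan * anon_prio / (anon_prio + file_prio);
	s.file = nr_to_scan - s.anon;
	return s;
}
```

Under this model a file-heavy workload would prefer a value near 100 (anon and file scanned equally, so more file cache survives), while an anon-heavy workload would prefer a low value that still leaves some pressure on anon pages.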

> 
> Thanks,
> 
> Chris
> 
>> Signed-off-by: Hui Zhu <teawaterz@linux.alibaba.com>
>> ---
>> mm/memcontrol.c | 5 +++++
>> 1 file changed, 5 insertions(+)
>> 
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index c5b5f74..e966396 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -7143,6 +7143,11 @@ static struct cftype swap_files[] = {
>> 		.file_offset = offsetof(struct mem_cgroup, swap_events_file),
>> 		.seq_show = swap_events_show,
>> 	},
>> +	{
>> +		.name = "swappiness",
>> +		.read_u64 = mem_cgroup_swappiness_read,
>> +		.write_u64 = mem_cgroup_swappiness_write,
>> +	},
>> 	{ }	/* terminate */
>> };
>> 
>> -- 
>> 2.7.4
>>
Andrew Morton Dec. 31, 2019, 11:16 p.m. UTC | #3
On Wed, 25 Dec 2019 13:36:48 +0800 Hui Zhu <teawater@gmail.com> wrote:

> Even if cgroup2 has swap.max, swappiness is still a very useful config.
> This commit adds swappiness to cgroup2.
> 
> ...
>
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -7143,6 +7143,11 @@ static struct cftype swap_files[] = {
>  		.file_offset = offsetof(struct mem_cgroup, swap_events_file),
>  		.seq_show = swap_events_show,
>  	},
> +	{
> +		.name = "swappiness",
> +		.read_u64 = mem_cgroup_swappiness_read,
> +		.write_u64 = mem_cgroup_swappiness_write,
> +	},
>  	{ }	/* terminate */
>  };

There should be a Documentation/ update with this?
teawater Jan. 2, 2020, 1:44 a.m. UTC | #4
> On Jan 1, 2020, at 07:16, Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> On Wed, 25 Dec 2019 13:36:48 +0800 Hui Zhu <teawater@gmail.com> wrote:
> 
>> Even if cgroup2 has swap.max, swappiness is still a very useful config.
>> This commit adds swappiness to cgroup2.
>> 
>> ...
>> 
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -7143,6 +7143,11 @@ static struct cftype swap_files[] = {
>> 		.file_offset = offsetof(struct mem_cgroup, swap_events_file),
>> 		.seq_show = swap_events_show,
>> 	},
>> +	{
>> +		.name = "swappiness",
>> +		.read_u64 = mem_cgroup_swappiness_read,
>> +		.write_u64 = mem_cgroup_swappiness_write,
>> +	},
>> 	{ }	/* terminate */
>> };
> 
> There should be a Documentation/ update with this?
> 

Hi Andrew,

I will post a new version with Documentation/ update.

Thanks,
Hui

> 
>
Michal Koutný Jan. 3, 2020, 9:48 a.m. UTC | #5
On Thu, Dec 26, 2019 at 02:56:40PM +0800, teawater <teawaterz@linux.alibaba.com> wrote:
> For example, an application does a lot of file access work in a
> memory-constrained environment.
> [...]
> Both of them are extreme examples.
The examples are quite generic. Do cgroup v2 controls really prevent
handling such workloads appropriately?

Besides that, note that per-cgroup swappiness as used in v1 cannot simply
be transferred into v2, because it's a concept that doesn't take cgroup
hierarchies into account (how would a parent's swappiness affect its
children? what would swappiness on inner nodes mean?).

HTH,
Michal
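
The hierarchy question can be made concrete with a small userspace model. This is a sketch under stated assumptions, not kernel code: struct demo_cgroup and both lookup functions are hypothetical. swappiness_flat mimics the v1 behaviour as I understand it (the root uses the global value, every other group uses its own value and ignores its ancestors), while swappiness_capped is just one invented hierarchical rule, shown only to illustrate that the two readings can disagree.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory cgroup; not the kernel's struct. */
struct demo_cgroup {
	const struct demo_cgroup *parent;	/* NULL for the root */
	int swappiness;				/* per-group setting, 0..100 */
};

/* Stand-in for the global vm.swappiness sysctl (kernel default is 60). */
static const int demo_vm_swappiness = 60;

/* v1-style flat lookup: ancestors have no influence on the result. */
static int swappiness_flat(const struct demo_cgroup *cg)
{
	assert(cg != NULL);
	if (cg->parent == NULL)
		return demo_vm_swappiness;
	return cg->swappiness;
}

/*
 * One hypothetical hierarchical rule: a group's effective value is
 * capped by the lowest value among its non-root ancestors.
 */
static int swappiness_capped(const struct demo_cgroup *cg)
{
	assert(cg != NULL);
	if (cg->parent == NULL)
		return demo_vm_swappiness;

	int eff = cg->swappiness;
	const struct demo_cgroup *p;

	for (p = cg->parent; p != NULL && p->parent != NULL; p = p->parent)
		if (p->swappiness < eff)
			eff = p->swappiness;
	return eff;
}
```

With a parent set to 0 and a child set to 100, the flat lookup gives the child 100 while the capped rule gives it 0. Neither answer is obviously right, which is the semantic gap being pointed out.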
Michal Hocko Jan. 6, 2020, 1:10 p.m. UTC | #6
On Wed 25-12-19 14:05:46, Chris Down wrote:
> Hi Hui,
> 
> Hui Zhu writes:
> > Even if cgroup2 has swap.max, swappiness is still a very useful config.
> > This commit adds swappiness to cgroup2.
> 
> When submitting patches like this, it's important to explain *why* you want
> it and what evidence there is. For example, how should one use this to
> compose a reasonable system? Why aren't existing protection controls
> sufficient for your use case? Where's the data?

Agreed!

> Also, why would swappiness be something cgroup-specific instead of
> hardware-specific, when desired swappiness is really largely about the
> hardware you have in your system?

I am not really sure I agree here though. Swappiness has been
traditionally more about workload because it has been believed that it
is a preference of the workload whether the anonymous or disk based
memory is more important. Whether this is a good interface is debatable
of course but time has shown that it is extremely hard to tune.

Not to mention that swappiness has been ignored for years for the vast
majority of workloads because of the highly biased file LRU reclaim.

At the time when cgroup v2 was introduced it'd been claimed that we
do not want to copy the v1 swappiness logic because of the semantic
shortcomings and that a better tuning should be developed in the future,
replacing even the global knob. AFAIR Johannes wanted to have a refault
vs. cost based file/anon balancing.

The lack of a sensible hierarchical behavior has been an even stronger
argument.
Chris Down Jan. 6, 2020, 1:24 p.m. UTC | #7
Michal Hocko writes:
>I am not really sure I agree here though. Swappiness has been
>traditionally more about workload because it has been believed that it
>is a preference of the workload whether the anonymous or disk based
>memory is more important. Whether this is a good interface is debatable
>of course but time has shown that it is extremely hard to tune.

Sure, it can theoretically be hardware- and workload-specific -- I don't think 
we disagree here. The reason I suggest it's a generally hardware-specific 
tunable rather than a workload-specific tunable is it's pretty rare to see 
anyone who's meaningfully used it for workload-specific tuning :-)

Patch

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c5b5f74..e966396 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -7143,6 +7143,11 @@  static struct cftype swap_files[] = {
 		.file_offset = offsetof(struct mem_cgroup, swap_events_file),
 		.seq_show = swap_events_show,
 	},
+	{
+		.name = "swappiness",
+		.read_u64 = mem_cgroup_swappiness_read,
+		.write_u64 = mem_cgroup_swappiness_write,
+	},
 	{ }	/* terminate */
 };
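
For readers unfamiliar with how such a table is consumed, the following userspace sketch models a sentinel-terminated array of control-file descriptors with read/write callbacks. It is an illustration only: struct demo_cftype, demo_find and the demo_* handlers are hypothetical simplifications, not the kernel's struct cftype or the real mem_cgroup_swappiness_read/write. The 0..100 range check is an assumption based on how the v1 write handler behaved at the time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced descriptor: a name plus optional callbacks. */
struct demo_cftype {
	char name[64];			/* empty name terminates the table */
	uint64_t (*read_u64)(void);
	int (*write_u64)(uint64_t val);
};

static uint64_t demo_swappiness = 60;	/* mirrors the global default */

static uint64_t demo_swappiness_read(void)
{
	return demo_swappiness;
}

static int demo_swappiness_write(uint64_t val)
{
	if (val > 100)
		return -1;		/* assumed valid range is 0..100 */
	demo_swappiness = val;
	return 0;
}

static uint64_t demo_swap_current_read(void)
{
	return 0;			/* placeholder value for the sketch */
}

/* Same shape as the patched table: entries, then an empty terminator. */
static const struct demo_cftype demo_swap_files[] = {
	{ .name = "swap.current", .read_u64 = demo_swap_current_read },
	{
		.name = "swappiness",
		.read_u64 = demo_swappiness_read,
		.write_u64 = demo_swappiness_write,
	},
	{ .name = "" },			/* terminate */
};

/* Walk the table until the empty-name sentinel; NULL if not found. */
static const struct demo_cftype *demo_find(const struct demo_cftype *cfts,
					   const char *name)
{
	const struct demo_cftype *cft;

	assert(cfts != NULL && name != NULL);
	for (cft = cfts; cft->name[0] != '\0'; cft++)
		if (strcmp(cft->name, name) == 0)
			return cft;
	return NULL;
}
```

Because the new entry sits before the terminator, a walker like demo_find picks it up without any other change, which is why the patch is only five lines. Whether reclaim on cgroup2 would actually consult the per-group value is a separate question that the thread's reviewers raise.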