diff mbox series

memcg: drop kmem.limit_in_bytes

Message ID 20230704115240.14672-1-mhocko@kernel.org (mailing list archive)
State New
Headers show
Series memcg: drop kmem.limit_in_bytes | expand

Commit Message

Michal Hocko July 4, 2023, 11:52 a.m. UTC
From: Michal Hocko <mhocko@suse.com>

kmem.limit_in_bytes (v1 way to limit kernel memory usage) has been
deprecated since 58056f77502f ("memcg, kmem: further deprecate
kmem.limit_in_bytes") merged in 5.16. We haven't heard about any
serious users since then but it seems that the mere presence of the file
is causing more harm thatn good. We (SUSE) have had several bug reports
from customers where Docker based containers started to fail because a
write to kmem.limit_in_bytes has failed.

This was unexpected because runc code only expects ENOENT (kmem
disabled) or EBUSY (tasks already running within cgroup). So a new error
code was unexpected and the whole container startup failed. This has
been later addressed by
https://github.com/opencontainers/runc/commit/52390d68040637dfc77f9fda6bbe70952423d380
so current Docker runtimes do not suffer from the problem anymore. There
are still older version of Docker in use and likely hard to get rid of
completely.

Address this by wiping out the file completely and effectively get back
to pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.

I would recommend backporting to stable trees which have picked up
58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes").

Cc: stable
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 Documentation/admin-guide/cgroup-v1/memory.rst |  2 --
 mm/memcontrol.c                                | 13 -------------
 2 files changed, 15 deletions(-)

Comments

Shakeel Butt July 5, 2023, 5:13 a.m. UTC | #1
On Tue, Jul 04, 2023 at 01:52:40PM +0200, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> kmem.limit_in_bytes (v1 way to limit kernel memory usage) has been
> deprecated since 58056f77502f ("memcg, kmem: further deprecate
> kmem.limit_in_bytes") merged in 5.16. We haven't heard about any
> serious users since then but it seems that the mere presence of the file
> is causing more harm thatn good. We (SUSE) have had several bug reports
> from customers where Docker based containers started to fail because a
> write to kmem.limit_in_bytes has failed.
> 
> This was unexpected because runc code only expects ENOENT (kmem
> disabled) or EBUSY (tasks already running within cgroup). So a new error
> code was unexpected and the whole container startup failed. This has
> been later addressed by
> https://github.com/opencontainers/runc/commit/52390d68040637dfc77f9fda6bbe70952423d380
> so current Docker runtimes do not suffer from the problem anymore. There
> are still older version of Docker in use and likely hard to get rid of
> completely.
> 
> Address this by wiping out the file completely and effectively get back
> to pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
> 
> I would recommend backporting to stable trees which have picked up
> 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes").
> 
> Cc: stable
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Acked-by: Shakeel Butt <shakeelb@google.com>
Johannes Weiner July 5, 2023, 1:44 p.m. UTC | #2
On Tue, Jul 04, 2023 at 01:52:40PM +0200, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> kmem.limit_in_bytes (v1 way to limit kernel memory usage) has been
> deprecated since 58056f77502f ("memcg, kmem: further deprecate
> kmem.limit_in_bytes") merged in 5.16. We haven't heard about any
> serious users since then but it seems that the mere presence of the file
> is causing more harm thatn good. We (SUSE) have had several bug reports
> from customers where Docker based containers started to fail because a
> write to kmem.limit_in_bytes has failed.
> 
> This was unexpected because runc code only expects ENOENT (kmem
> disabled) or EBUSY (tasks already running within cgroup). So a new error
> code was unexpected and the whole container startup failed. This has
> been later addressed by
> https://github.com/opencontainers/runc/commit/52390d68040637dfc77f9fda6bbe70952423d380
> so current Docker runtimes do not suffer from the problem anymore. There
> are still older version of Docker in use and likely hard to get rid of
> completely.
> 
> Address this by wiping out the file completely and effectively get back
> to pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
> 
> I would recommend backporting to stable trees which have picked up
> 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes").
> 
> Cc: stable
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  Documentation/admin-guide/cgroup-v1/memory.rst |  2 --
>  mm/memcontrol.c                                | 13 -------------
>  2 files changed, 15 deletions(-)
> 
> diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
> index 47d1d7d932a8..b92c71f39172 100644
> --- a/Documentation/admin-guide/cgroup-v1/memory.rst
> +++ b/Documentation/admin-guide/cgroup-v1/memory.rst
> @@ -92,8 +92,6 @@ Brief summary of control files.
>   memory.oom_control		     set/show oom controls.
>   memory.numa_stat		     show the number of memory usage per numa
>  				     node
> - memory.kmem.limit_in_bytes          This knob is deprecated and writing to
> -                                     it will return -ENOTSUPP.
>   memory.kmem.usage_in_bytes          show current kernel memory allocation
>   memory.kmem.failcnt                 show the number of kernel memory usage
>  				     hits limits
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 4b27e245a055..a0d3ed8d02e2 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3750,9 +3750,6 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
>  	case _MEMSWAP:
>  		counter = &memcg->memsw;
>  		break;
> -	case _KMEM:
> -		counter = &memcg->kmem;
> -		break;
>  	case _TCP:
>  		counter = &memcg->tcpmem;
>  		break;

This case is still needed for the remaining kmem files:

	{
		.name = "kmem.usage_in_bytes",
		.private = MEMFILE_PRIVATE(_KMEM, RES_USAGE),
		.read_u64 = mem_cgroup_read_u64,
	},
	{
		.name = "kmem.failcnt",
		.private = MEMFILE_PRIVATE(_KMEM, RES_FAILCNT),
		.write = mem_cgroup_reset,
		.read_u64 = mem_cgroup_read_u64,
	},
	{
		.name = "kmem.max_usage_in_bytes",
		.private = MEMFILE_PRIVATE(_KMEM, RES_MAX_USAGE),
		.write = mem_cgroup_reset,
		.read_u64 = mem_cgroup_read_u64,
	},

otherwise they BUG() when reading.

Without this hunk, the patch looks good to me.
Michal Hocko July 7, 2023, 7:07 a.m. UTC | #3
On Wed 05-07-23 09:44:34, Johannes Weiner wrote:
[...]
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 4b27e245a055..a0d3ed8d02e2 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -3750,9 +3750,6 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
> >  	case _MEMSWAP:
> >  		counter = &memcg->memsw;
> >  		break;
> > -	case _KMEM:
> > -		counter = &memcg->kmem;
> > -		break;
> >  	case _TCP:
> >  		counter = &memcg->tcpmem;
> >  		break;
> 
> This case is still needed for the remaining kmem files:
> 
> 	{
> 		.name = "kmem.usage_in_bytes",
> 		.private = MEMFILE_PRIVATE(_KMEM, RES_USAGE),
> 		.read_u64 = mem_cgroup_read_u64,
> 	},
> 	{
> 		.name = "kmem.failcnt",
> 		.private = MEMFILE_PRIVATE(_KMEM, RES_FAILCNT),
> 		.write = mem_cgroup_reset,
> 		.read_u64 = mem_cgroup_read_u64,
> 	},
> 	{
> 		.name = "kmem.max_usage_in_bytes",
> 		.private = MEMFILE_PRIVATE(_KMEM, RES_MAX_USAGE),
> 		.write = mem_cgroup_reset,
> 		.read_u64 = mem_cgroup_read_u64,
> 	},
> 
> otherwise they BUG() when reading.
> 
> Without this hunk, the patch looks good to me.

You are right! Thanks. The updated patch below. Andrew could you replace
by this version please?
--- 
From b1bbc63b6baff9c5aaf30393cec29112b3abca25 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Tue, 4 Jul 2023 13:38:11 +0200
Subject: [PATCH] memcg: drop kmem.limit_in_bytes

kmem.limit_in_bytes (v1 way to limit kernel memory usage) has been
deprecated since 58056f77502f ("memcg, kmem: further deprecate
kmem.limit_in_bytes") merged in 5.16. We haven't heard about any
serious users since then but it seems that the mere presence of the file
is causing more harm thatn good. We (SUSE) have had several bug reports
from customers where Docker based containers started to fail because a
write to kmem.limit_in_bytes has failed.

This was unexpected because runc code only expects ENOENT (kmem
disabled) or EBUSY (tasks already running within cgroup). So a new error
code was unexpected and the whole container startup failed. This has
been later addressed by
https://github.com/opencontainers/runc/commit/52390d68040637dfc77f9fda6bbe70952423d380
so current Docker runtimes do not suffer from the problem anymore. There
are still older version of Docker in use and likely hard to get rid of
completely.

Address this by wiping out the file completely and effectively get back
to pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.

I would recommend backporting to stable trees which have picked up
58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes").

Cc: stable
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 Documentation/admin-guide/cgroup-v1/memory.rst |  2 --
 mm/memcontrol.c                                | 10 ----------
 2 files changed, 12 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index fabaad3fd9c2..8d3afeede10e 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -92,8 +92,6 @@ Brief summary of control files.
  memory.oom_control		     set/show oom controls.
  memory.numa_stat		     show the number of memory usage per numa
 				     node
- memory.kmem.limit_in_bytes          This knob is deprecated and writing to
-                                     it will return -ENOTSUPP.
  memory.kmem.usage_in_bytes          show current kernel memory allocation
  memory.kmem.failcnt                 show the number of kernel memory usage
 				     hits limits
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e8ca4bdcb03c..ab99503c9ff2 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3871,10 +3871,6 @@ static ssize_t mem_cgroup_write(struct kernfs_open_file *of,
 		case _MEMSWAP:
 			ret = mem_cgroup_resize_max(memcg, nr_pages, true);
 			break;
-		case _KMEM:
-			/* kmem.limit_in_bytes is deprecated. */
-			ret = -EOPNOTSUPP;
-			break;
 		case _TCP:
 			ret = memcg_update_tcp_max(memcg, nr_pages);
 			break;
@@ -5085,12 +5081,6 @@ static struct cftype mem_cgroup_legacy_files[] = {
 		.seq_show = memcg_numa_stat_show,
 	},
 #endif
-	{
-		.name = "kmem.limit_in_bytes",
-		.private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
-		.write = mem_cgroup_write,
-		.read_u64 = mem_cgroup_read_u64,
-	},
 	{
 		.name = "kmem.usage_in_bytes",
 		.private = MEMFILE_PRIVATE(_KMEM, RES_USAGE),
Johannes Weiner July 7, 2023, 1:50 p.m. UTC | #4
On Fri, Jul 07, 2023 at 09:07:47AM +0200, Michal Hocko wrote:
> From b1bbc63b6baff9c5aaf30393cec29112b3abca25 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Tue, 4 Jul 2023 13:38:11 +0200
> Subject: [PATCH] memcg: drop kmem.limit_in_bytes
> 
> kmem.limit_in_bytes (v1 way to limit kernel memory usage) has been
> deprecated since 58056f77502f ("memcg, kmem: further deprecate
> kmem.limit_in_bytes") merged in 5.16. We haven't heard about any
> serious users since then but it seems that the mere presence of the file
> is causing more harm thatn good. We (SUSE) have had several bug reports
> from customers where Docker based containers started to fail because a
> write to kmem.limit_in_bytes has failed.
> 
> This was unexpected because runc code only expects ENOENT (kmem
> disabled) or EBUSY (tasks already running within cgroup). So a new error
> code was unexpected and the whole container startup failed. This has
> been later addressed by
> https://github.com/opencontainers/runc/commit/52390d68040637dfc77f9fda6bbe70952423d380
> so current Docker runtimes do not suffer from the problem anymore. There
> are still older version of Docker in use and likely hard to get rid of
> completely.
> 
> Address this by wiping out the file completely and effectively get back
> to pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
> 
> I would recommend backporting to stable trees which have picked up
> 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes").
> 
> Cc: stable
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Roman Gushchin July 7, 2023, 4:40 p.m. UTC | #5
On Fri, Jul 07, 2023 at 09:07:47AM +0200, Michal Hocko wrote:
> On Wed 05-07-23 09:44:34, Johannes Weiner wrote:
> [...]
> > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > index 4b27e245a055..a0d3ed8d02e2 100644
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -3750,9 +3750,6 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
> > >  	case _MEMSWAP:
> > >  		counter = &memcg->memsw;
> > >  		break;
> > > -	case _KMEM:
> > > -		counter = &memcg->kmem;
> > > -		break;
> > >  	case _TCP:
> > >  		counter = &memcg->tcpmem;
> > >  		break;
> > 
> > This case is still needed for the remaining kmem files:
> > 
> > 	{
> > 		.name = "kmem.usage_in_bytes",
> > 		.private = MEMFILE_PRIVATE(_KMEM, RES_USAGE),
> > 		.read_u64 = mem_cgroup_read_u64,
> > 	},
> > 	{
> > 		.name = "kmem.failcnt",
> > 		.private = MEMFILE_PRIVATE(_KMEM, RES_FAILCNT),
> > 		.write = mem_cgroup_reset,
> > 		.read_u64 = mem_cgroup_read_u64,
> > 	},
> > 	{
> > 		.name = "kmem.max_usage_in_bytes",
> > 		.private = MEMFILE_PRIVATE(_KMEM, RES_MAX_USAGE),
> > 		.write = mem_cgroup_reset,
> > 		.read_u64 = mem_cgroup_read_u64,
> > 	},
> > 
> > otherwise they BUG() when reading.
> > 
> > Without this hunk, the patch looks good to me.
> 
> You are right! Thanks. The updated patch below. Andrew could you replace
> by this version please?
> --- 
> From b1bbc63b6baff9c5aaf30393cec29112b3abca25 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Tue, 4 Jul 2023 13:38:11 +0200
> Subject: [PATCH] memcg: drop kmem.limit_in_bytes
> 
> kmem.limit_in_bytes (v1 way to limit kernel memory usage) has been
> deprecated since 58056f77502f ("memcg, kmem: further deprecate
> kmem.limit_in_bytes") merged in 5.16. We haven't heard about any
> serious users since then but it seems that the mere presence of the file
> is causing more harm thatn good. We (SUSE) have had several bug reports
> from customers where Docker based containers started to fail because a
> write to kmem.limit_in_bytes has failed.
> 
> This was unexpected because runc code only expects ENOENT (kmem
> disabled) or EBUSY (tasks already running within cgroup). So a new error
> code was unexpected and the whole container startup failed. This has
> been later addressed by
> https://github.com/opencontainers/runc/commit/52390d68040637dfc77f9fda6bbe70952423d380
> so current Docker runtimes do not suffer from the problem anymore. There
> are still older version of Docker in use and likely hard to get rid of
> completely.
> 
> Address this by wiping out the file completely and effectively get back
> to pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
> 
> I would recommend backporting to stable trees which have picked up
> 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes").
> 
> Cc: stable
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Acked-by: Roman Gushchin <roman.gushchin@linux.dev>

Thanks!
diff mbox series

Patch

diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index 47d1d7d932a8..b92c71f39172 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -92,8 +92,6 @@  Brief summary of control files.
  memory.oom_control		     set/show oom controls.
  memory.numa_stat		     show the number of memory usage per numa
 				     node
- memory.kmem.limit_in_bytes          This knob is deprecated and writing to
-                                     it will return -ENOTSUPP.
  memory.kmem.usage_in_bytes          show current kernel memory allocation
  memory.kmem.failcnt                 show the number of kernel memory usage
 				     hits limits
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 4b27e245a055..a0d3ed8d02e2 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3750,9 +3750,6 @@  static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
 	case _MEMSWAP:
 		counter = &memcg->memsw;
 		break;
-	case _KMEM:
-		counter = &memcg->kmem;
-		break;
 	case _TCP:
 		counter = &memcg->tcpmem;
 		break;
@@ -3913,10 +3910,6 @@  static ssize_t mem_cgroup_write(struct kernfs_open_file *of,
 		case _MEMSWAP:
 			ret = mem_cgroup_resize_max(memcg, nr_pages, true);
 			break;
-		case _KMEM:
-			/* kmem.limit_in_bytes is deprecated. */
-			ret = -EOPNOTSUPP;
-			break;
 		case _TCP:
 			ret = memcg_update_tcp_max(memcg, nr_pages);
 			break;
@@ -5132,12 +5125,6 @@  static struct cftype mem_cgroup_legacy_files[] = {
 		.seq_show = memcg_numa_stat_show,
 	},
 #endif
-	{
-		.name = "kmem.limit_in_bytes",
-		.private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
-		.write = mem_cgroup_write,
-		.read_u64 = mem_cgroup_read_u64,
-	},
 	{
 		.name = "kmem.usage_in_bytes",
 		.private = MEMFILE_PRIVATE(_KMEM, RES_USAGE),