diff mbox series

[v3] mm, memcg: fix error return value of mem_cgroup_css_alloc()

Message ID 1586192163-20099-1-git-send-email-laoar.shao@gmail.com (mailing list archive)
State New, archived
Headers show
Series [v3] mm, memcg: fix error return value of mem_cgroup_css_alloc() | expand

Commit Message

Yafang Shao April 6, 2020, 4:56 p.m. UTC
When I run my memcg testcase which creates lots of memcgs, I found
there're unexpected out of memory logs while there're still enough
available free memory. The error log is,
mkdir: cannot create directory 'foo.65533': Cannot allocate memory

The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs, an
-ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right errno
should be -EBUSY "Device or resource busy". That is same with
memcg_alloc_cache_id().

As the errno really misled me, we should make it right. After this patch,
the error log will be,
mkdir: cannot create directory 'foo.65533': Device or resource busy

Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 mm/memcontrol.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

Comments

Andrew Morton April 6, 2020, 11:23 p.m. UTC | #1
On Tue,  7 Apr 2020 00:56:03 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:

> When I run my memcg testcase which creates lots of memcgs, I found
> there're unexpected out of memory logs while there're still enough
> available free memory. The error log is,
> mkdir: cannot create directory 'foo.65533': Cannot allocate memory
> 
> The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs, an
> -ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right errno
> should be -EBUSY "Device or resource busy". That is same with
> memcg_alloc_cache_id().
> 
> As the errno really misled me, we should make it right. After this patch,
> the error log will be,
> mkdir: cannot create directory 'foo.65533': Device or resource busy

Thanks.

Was a -stable backport considered?
Yafang Shao April 7, 2020, 3:02 a.m. UTC | #2
On Tue, Apr 7, 2020 at 7:23 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Tue,  7 Apr 2020 00:56:03 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:
>
> > When I run my memcg testcase which creates lots of memcgs, I found
> > there're unexpected out of memory logs while there're still enough
> > available free memory. The error log is,
> > mkdir: cannot create directory 'foo.65533': Cannot allocate memory
> >
> > The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs, an
> > -ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right errno
> > should be -EBUSY "Device or resource busy". That is same with
> > memcg_alloc_cache_id().
> >
> > As the errno really misled me, we should make it right. After this patch,
> > the error log will be,
> > mkdir: cannot create directory 'foo.65533': Device or resource busy
>
> Thanks.
>
> Was a -stable backport considered?

I only backported to our kernel version 4.18, but I'm not sure whether
it will apply to -stable or not.
Seems this issue is introduced long time ago. I will try to backport
it to -stable.
Should I try it based on 5.5.y and 5.6.y only ? Or all the LTS kernel
versions ?

Thanks
Yafang
Andrew Morton April 7, 2020, 3:09 a.m. UTC | #3
On Tue, 7 Apr 2020 11:02:31 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:

> On Tue, Apr 7, 2020 at 7:23 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > On Tue,  7 Apr 2020 00:56:03 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > > When I run my memcg testcase which creates lots of memcgs, I found
> > > there're unexpected out of memory logs while there're still enough
> > > available free memory. The error log is,
> > > mkdir: cannot create directory 'foo.65533': Cannot allocate memory
> > >
> > > The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs, an
> > > -ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right errno
> > > should be -EBUSY "Device or resource busy". That is same with
> > > memcg_alloc_cache_id().
> > >
> > > As the errno really misled me, we should make it right. After this patch,
> > > the error log will be,
> > > mkdir: cannot create directory 'foo.65533': Device or resource busy
> >
> > Thanks.
> >
> > Was a -stable backport considered?
> 
> I only backported to our kernel version 4.18, but I'm not sure whether
> it will apply to -stable or not.
> Seems this issue is introduced long time ago. I will try to backport
> it to -stable.
> Should I try it based on 5.5.y and 5.6.y only ? Or all the LTS kernel
> versions ?

What I'm asking (of you and of reviewers) is whether this issue is
sufficiently serious to require fixing in -stable kernels.  The
patch merging details can be sorted out later.
Yafang Shao April 7, 2020, 3:11 a.m. UTC | #4
On Tue, Apr 7, 2020 at 11:09 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Tue, 7 Apr 2020 11:02:31 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:
>
> > On Tue, Apr 7, 2020 at 7:23 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> > >
> > > On Tue,  7 Apr 2020 00:56:03 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:
> > >
> > > > When I run my memcg testcase which creates lots of memcgs, I found
> > > > there're unexpected out of memory logs while there're still enough
> > > > available free memory. The error log is,
> > > > mkdir: cannot create directory 'foo.65533': Cannot allocate memory
> > > >
> > > > The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs, an
> > > > -ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right errno
> > > > should be -EBUSY "Device or resource busy". That is same with
> > > > memcg_alloc_cache_id().
> > > >
> > > > As the errno really misled me, we should make it right. After this patch,
> > > > the error log will be,
> > > > mkdir: cannot create directory 'foo.65533': Device or resource busy
> > >
> > > Thanks.
> > >
> > > Was a -stable backport considered?
> >
> > I only backported to our kernel version 4.18, but I'm not sure whether
> > it will apply to -stable or not.
> > Seems this issue is introduced long time ago. I will try to backport
> > it to -stable.
> > Should I try it based on 5.5.y and 5.6.y only ? Or all the LTS kernel
> > versions ?
>
> What I'm asking (of you and of reviewers) is whether this issue is
> sufficiently serious to require fixing in -stable kernels.  The
> patch merging details can be sorted out later.
>
>

I think it is worth to cc stable, as this errno will mislead the user.

Thanks
Yafang
Michal Hocko April 7, 2020, 6:43 a.m. UTC | #5
[I have only now noticed there was another version posted. Please try to
wait a bit longer for other feedback before reposting a newer version]

On Tue 07-04-20 11:11:13, Yafang Shao wrote:
> On Tue, Apr 7, 2020 at 11:09 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > On Tue, 7 Apr 2020 11:02:31 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > > On Tue, Apr 7, 2020 at 7:23 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> > > >
> > > > On Tue,  7 Apr 2020 00:56:03 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:
> > > >
> > > > > When I run my memcg testcase which creates lots of memcgs, I found
> > > > > there're unexpected out of memory logs while there're still enough
> > > > > available free memory. The error log is,
> > > > > mkdir: cannot create directory 'foo.65533': Cannot allocate memory
> > > > >
> > > > > The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs, an
> > > > > -ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right errno
> > > > > should be -EBUSY "Device or resource busy". That is same with
> > > > > memcg_alloc_cache_id().

See my comment about EBUSY in the previous version
http://lkml.kernel.org/r/20200407063621.GA18914@dhcp22.suse.cz

> > > > > As the errno really misled me, we should make it right. After this patch,
> > > > > the error log will be,
> > > > > mkdir: cannot create directory 'foo.65533': Device or resource busy
> > > >
> > > > Thanks.
> > > >
> > > > Was a -stable backport considered?
> > >
> > > I only backported to our kernel version 4.18, but I'm not sure whether
> > > it will apply to -stable or not.
> > > Seems this issue is introduced long time ago. I will try to backport
> > > it to -stable.
> > > Should I try it based on 5.5.y and 5.6.y only ? Or all the LTS kernel
> > > versions ?
> >
> > What I'm asking (of you and of reviewers) is whether this issue is
> > sufficiently serious to require fixing in -stable kernels.  The
> > patch merging details can be sorted out later.
> >
> >
> 
> I think it is worth to cc stable, as this errno will mislead the user.

We have idr failure path since 73f576c04b94 ("mm: memcontrol: fix cgroup
creation failure after many small jobs") which is more than four years
ago without anybody ever noticing. While I do agree that the existing
behavior might be confusing I am not really sure this qualifies as
stable material TBH.
Yafang Shao April 7, 2020, 9:31 a.m. UTC | #6
On Tue, Apr 7, 2020 at 2:43 PM Michal Hocko <mhocko@kernel.org> wrote:
>
> [I have only now noticed there was another version posted. Please try to
> wait a bit longer for other feedback before reposting a newer version]
>

Sure I will. sorry about that.
But after we send a patch, we always don't know in which state this
patch is, e.g. Changes Requested, Not Applicable, Rejected, Needs
Review / ACK and etc, as listed in netdev list[1]. That could give us
an explicit instruction on what to do next.

[1]. https://patchwork.ozlabs.org/project/netdev/list/?state=*&page=2&param=3

> On Tue 07-04-20 11:11:13, Yafang Shao wrote:
> > On Tue, Apr 7, 2020 at 11:09 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> > >
> > > On Tue, 7 Apr 2020 11:02:31 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:
> > >
> > > > On Tue, Apr 7, 2020 at 7:23 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> > > > >
> > > > > On Tue,  7 Apr 2020 00:56:03 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:
> > > > >
> > > > > > When I run my memcg testcase which creates lots of memcgs, I found
> > > > > > there're unexpected out of memory logs while there're still enough
> > > > > > available free memory. The error log is,
> > > > > > mkdir: cannot create directory 'foo.65533': Cannot allocate memory
> > > > > >
> > > > > > The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs, an
> > > > > > -ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right errno
> > > > > > should be -EBUSY "Device or resource busy". That is same with
> > > > > > memcg_alloc_cache_id().
>
> See my comment about EBUSY in the previous version
> http://lkml.kernel.org/r/20200407063621.GA18914@dhcp22.suse.cz
>

From the perspective of mkdir(2), it is better to use ENOSPC here. But
I'm not sure whether it is better to update the man page of mkdir(2)
or not.

I checked the explaination about ENOSPC in 73f576c04b94 ("mm:
memcontrol: fix cgroup creation failure after many small jobs")
carefully, but I don't a clear idea which one is better now.

Per Matthew, EBUSY is better, while per you, ENOSPC is better.

Matthew, would you mind to give more detailed explanation ?

> > > > > > As the errno really misled me, we should make it right. After this patch,
> > > > > > the error log will be,
> > > > > > mkdir: cannot create directory 'foo.65533': Device or resource busy
> > > > >
> > > > > Thanks.
> > > > >
> > > > > Was a -stable backport considered?
> > > >
> > > > I only backported to our kernel version 4.18, but I'm not sure whether
> > > > it will apply to -stable or not.
> > > > Seems this issue is introduced long time ago. I will try to backport
> > > > it to -stable.
> > > > Should I try it based on 5.5.y and 5.6.y only ? Or all the LTS kernel
> > > > versions ?
> > >
> > > What I'm asking (of you and of reviewers) is whether this issue is
> > > sufficiently serious to require fixing in -stable kernels.  The
> > > patch merging details can be sorted out later.
> > >
> > >
> >
> > I think it is worth to cc stable, as this errno will mislead the user.
>
> We have idr failure path since 73f576c04b94 ("mm: memcontrol: fix cgroup
> creation failure after many small jobs") which is more than four years
> ago without anybody ever noticing. While I do agree that the existing
> behavior might be confusing I am not really sure this qualifies as
> stable material TBH.

OK

Thanks
Yafang
Michal Hocko April 7, 2020, 11:10 a.m. UTC | #7
On Tue 07-04-20 17:31:33, Yafang Shao wrote:
> On Tue, Apr 7, 2020 at 2:43 PM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > [I have only now noticed there was another version posted. Please try to
> > wait a bit longer for other feedback before reposting a newer version]
> >
> 
> Sure I will. sorry about that.
> But after we send a patch, we always don't know in which state this
> patch is, e.g. Changes Requested, Not Applicable, Rejected, Needs
> Review / ACK and etc, as listed in netdev list[1]. That could give us
> an explicit instruction on what to do next.

Yes I can see how that can be unclear or consuming at times. But a
general rule of thumb I would suggest is to simply wait few days before
reposting a new version. If you need to discuss an incremental change
based on the review feedback you can always do that in the original
email thread and repost the new version after the review feedback
settles.

> [1]. https://patchwork.ozlabs.org/project/netdev/list/?state=*&page=2&param=3
> 
> > On Tue 07-04-20 11:11:13, Yafang Shao wrote:
[...]
> > See my comment about EBUSY in the previous version
> > http://lkml.kernel.org/r/20200407063621.GA18914@dhcp22.suse.cz
> >
> 
> >From the perspective of mkdir(2), it is better to use ENOSPC here. But
> I'm not sure whether it is better to update the man page of mkdir(2)
> or not.

Well, I do not know whether EBUSY missing in the man page is an omission
or an intention. But adding it only for this single operation failure
would sound like an overkill to me.

> I checked the explaination about ENOSPC in 73f576c04b94 ("mm:
> memcontrol: fix cgroup creation failure after many small jobs")
> carefully, but I don't a clear idea which one is better now.

The changelog simply mentioned that without the additional id tracking
the ENOSPC would have been returned from elsewhere. I haven't checked
but I suspect it would be from the cgroup core. This patch just didn't
propagate the idr failure specifically and hid it under the ENOMEM
failure path. I can only speculate why that was the case but I suspect
that Johannes simply didn't consider the distinction important enough.
All I am saying is that preserving the failure mode would be preferable.
Especially unless there a strong reason to use EBUSY which then should
be carefully explained in the changelog.
Johannes Weiner April 7, 2020, 6:10 p.m. UTC | #8
On Tue, Apr 07, 2020 at 01:10:17PM +0200, Michal Hocko wrote:
> On Tue 07-04-20 17:31:33, Yafang Shao wrote:
> > I checked the explaination about ENOSPC in 73f576c04b94 ("mm:
> > memcontrol: fix cgroup creation failure after many small jobs")
> > carefully, but I don't a clear idea which one is better now.
> 
> The changelog simply mentioned that without the additional id tracking
> the ENOSPC would have been returned from elsewhere. I haven't checked
> but I suspect it would be from the cgroup core. This patch just didn't
> propagate the idr failure specifically and hid it under the ENOMEM
> failure path. I can only speculate why that was the case but I suspect
> that Johannes simply didn't consider the distinction important enough.

The old -ENOSPC came directly from the memcg css online callback:

-       if (css->id > MEM_CGROUP_ID_MAX)
-               return -ENOSPC;

And it only became a problem because, on big memory machines (128+G)
with a high rate of short-lived jobs, lazily freed cgroups piled up
over the course of multiple days and clogged up the ID space. Nobody
actually tried to create 64k user-visible cgroups concurrently. It's
hard to imagine any machine running that many meaningfully distinct
memory consumers in parallel. So I didn't think (and still don't tbh)
it matters all that much in practice what we return here.

I'm not against changing it back to -ENOSPC. And I agree with Michal
it might be better than -EBUSY because of the mkdir() interface.

Given how unlikely this is to affect real setups, I don't think this
patch is stable material.
Andrew Morton April 9, 2020, 1:29 a.m. UTC | #9
On Tue, 7 Apr 2020 14:10:12 -0400 Johannes Weiner <hannes@cmpxchg.org> wrote:

> Given how unlikely this is to affect real setups, I don't think this
> patch is stable material.

OK, thanks, done.

I didn't notice any acks - is this patch otherwise good to go upstream?
Michal Hocko April 9, 2020, 6:57 a.m. UTC | #10
On Wed 08-04-20 18:29:56, Andrew Morton wrote:
> On Tue, 7 Apr 2020 14:10:12 -0400 Johannes Weiner <hannes@cmpxchg.org> wrote:
> 
> > Given how unlikely this is to affect real setups, I don't think this
> > patch is stable material.
> 
> OK, thanks, done.
> 
> I didn't notice any acks - is this patch otherwise good to go upstream?

I will ack it if s@EBUSY@ENOSPC@
Yafang Shao April 9, 2020, 1:59 p.m. UTC | #11
On Thu, Apr 9, 2020 at 2:57 PM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Wed 08-04-20 18:29:56, Andrew Morton wrote:
> > On Tue, 7 Apr 2020 14:10:12 -0400 Johannes Weiner <hannes@cmpxchg.org> wrote:
> >
> > > Given how unlikely this is to affect real setups, I don't think this
> > > patch is stable material.
> >
> > OK, thanks, done.
> >
> > I didn't notice any acks - is this patch otherwise good to go upstream?
>
> I will ack it if s@EBUSY@ENOSPC@
>

BTW, there's no EBUSY in man page of write(2) neither.
But when we echo some procs into cgroup.subtree_control of a non-root
cgroup, it will return EBUSY. See also cgroup_subtree_control_write().
Pls. correct me if I missed something.

We have to admit that the man page is out of date.

Thanks
Yafang
Michal Hocko April 9, 2020, 2:07 p.m. UTC | #12
On Thu 09-04-20 21:59:42, Yafang Shao wrote:
> On Thu, Apr 9, 2020 at 2:57 PM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > On Wed 08-04-20 18:29:56, Andrew Morton wrote:
> > > On Tue, 7 Apr 2020 14:10:12 -0400 Johannes Weiner <hannes@cmpxchg.org> wrote:
> > >
> > > > Given how unlikely this is to affect real setups, I don't think this
> > > > patch is stable material.
> > >
> > > OK, thanks, done.
> > >
> > > I didn't notice any acks - is this patch otherwise good to go upstream?
> >
> > I will ack it if s@EBUSY@ENOSPC@
> >
> 
> BTW, there's no EBUSY in man page of write(2) neither.
> But when we echo some procs into cgroup.subtree_control of a non-root
> cgroup, it will return EBUSY. See also cgroup_subtree_control_write().
> Pls. correct me if I missed something.
> 
> We have to admit that the man page is out of date.

Or the code is returning something unexpected but nobody has noticed so
far. Please talk to cgroup maintainers on how to address that.
Andrew Morton April 20, 2020, 11:44 p.m. UTC | #13
On Thu, 9 Apr 2020 16:07:29 +0200 Michal Hocko <mhocko@kernel.org> wrote:

> On Thu 09-04-20 21:59:42, Yafang Shao wrote:
> > On Thu, Apr 9, 2020 at 2:57 PM Michal Hocko <mhocko@kernel.org> wrote:
> > >
> > > On Wed 08-04-20 18:29:56, Andrew Morton wrote:
> > > > On Tue, 7 Apr 2020 14:10:12 -0400 Johannes Weiner <hannes@cmpxchg.org> wrote:
> > > >
> > > > > Given how unlikely this is to affect real setups, I don't think this
> > > > > patch is stable material.
> > > >
> > > > OK, thanks, done.
> > > >
> > > > I didn't notice any acks - is this patch otherwise good to go upstream?
> > >
> > > I will ack it if s@EBUSY@ENOSPC@
> > >
> > 
> > BTW, there's no EBUSY in man page of write(2) neither.
> > But when we echo some procs into cgroup.subtree_control of a non-root
> > cgroup, it will return EBUSY. See also cgroup_subtree_control_write().
> > Pls. correct me if I missed something.
> > 
> > We have to admit that the man page is out of date.
> 
> Or the code is returning something unexpected but nobody has noticed so
> far. Please talk to cgroup maintainers on how to address that.

I agree with your ENOSPC reasoning in
http://lkml.kernel.org/r/20200407063621.GA18914@dhcp22.suse.cz

That means this change, methinks:

--- a/mm/memcontrol.c~mm-memcg-fix-error-return-value-of-mem_cgroup_css_alloc-fix
+++ a/mm/memcontrol.c
@@ -2721,8 +2721,6 @@ static int memcg_alloc_cache_id(void)
 
 	id = ida_simple_get(&memcg_cache_ida,
 			    0, MEMCG_CACHES_MAX_SIZE, GFP_KERNEL);
-	if (id == -ENOSPC)
-		return -EBUSY;
 	if (id < 0)
 		return id;
 
@@ -5004,10 +5002,6 @@ static struct mem_cgroup *mem_cgroup_all
 	memcg->id.id = idr_alloc(&mem_cgroup_idr, NULL,
 				 1, MEM_CGROUP_ID_MAX,
 				 GFP_KERNEL);
-	if (memcg->id.id == -ENOSPC) {
-		error = -EBUSY;
-		goto fail;
-	}
 	if (memcg->id.id < 0) {
 		error = memcg->id.id;
 		goto fail;
Johannes Weiner April 21, 2020, 2:44 p.m. UTC | #14
On Mon, Apr 20, 2020 at 04:44:37PM -0700, Andrew Morton wrote:
> From: Yafang Shao <laoar.shao@gmail.com>
> Subject: mm, memcg: fix error return value of mem_cgroup_css_alloc()
> 
> When I run my memcg testcase which creates lots of memcgs, I found
> there're unexpected out of memory logs while there're still enough
> available free memory.  The error log is, mkdir: cannot create directory
> 'foo.65533': Cannot allocate memory
> 
> The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs, an
> -ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right errno
> should be -ENOSPC "No space left on device", which is an appropriate errno
> for userspace's failed mkdir.
> 
> As the errno really misled me, we should make it right.  After this patch,
> the error log will be "mkdir: cannot create directory 'foo.65533': No
> space left on device"
> 
> [akpm@linux-foundation.org: s/EBUSY/ENOSPC/, per Michal]
> Link: http://lkml.kernel.org/r/20200407063621.GA18914@dhcp22.suse.cz
> Link: http://lkml.kernel.org/r/1586192163-20099-1-git-send-email-laoar.shao@gmail.com
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Acked-by: Michal Hocko <mhocko@kernel.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Fixes: 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure after many small jobs")
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
diff mbox series

Patch

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ca194864d802..042af0bc4a05 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2717,6 +2717,8 @@  static int memcg_alloc_cache_id(void)
 
 	id = ida_simple_get(&memcg_cache_ida,
 			    0, MEMCG_CACHES_MAX_SIZE, GFP_KERNEL);
+	if (id == -ENOSPC)
+		return -EBUSY;
 	if (id < 0)
 		return id;
 
@@ -4986,19 +4988,26 @@  static struct mem_cgroup *mem_cgroup_alloc(void)
 	unsigned int size;
 	int node;
 	int __maybe_unused i;
+	long error = -ENOMEM;
 
 	size = sizeof(struct mem_cgroup);
 	size += nr_node_ids * sizeof(struct mem_cgroup_per_node *);
 
 	memcg = kzalloc(size, GFP_KERNEL);
 	if (!memcg)
-		return NULL;
+		return ERR_PTR(error);
 
 	memcg->id.id = idr_alloc(&mem_cgroup_idr, NULL,
 				 1, MEM_CGROUP_ID_MAX,
 				 GFP_KERNEL);
-	if (memcg->id.id < 0)
+	if (memcg->id.id == -ENOSPC) {
+		error = -EBUSY;
+		goto fail;
+	}
+	if (memcg->id.id < 0) {
+		error = memcg->id.id;
 		goto fail;
+	}
 
 	memcg->vmstats_local = alloc_percpu(struct memcg_vmstats_percpu);
 	if (!memcg->vmstats_local)
@@ -5042,7 +5051,7 @@  static struct mem_cgroup *mem_cgroup_alloc(void)
 fail:
 	mem_cgroup_id_remove(memcg);
 	__mem_cgroup_free(memcg);
-	return NULL;
+	return ERR_PTR(error);
 }
 
 static struct cgroup_subsys_state * __ref
@@ -5053,8 +5062,8 @@  mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 	long error = -ENOMEM;
 
 	memcg = mem_cgroup_alloc();
-	if (!memcg)
-		return ERR_PTR(error);
+	if (IS_ERR(memcg))
+		return ERR_CAST(memcg);
 
 	WRITE_ONCE(memcg->high, PAGE_COUNTER_MAX);
 	memcg->soft_limit = PAGE_COUNTER_MAX;
@@ -5104,7 +5113,7 @@  mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 fail:
 	mem_cgroup_id_remove(memcg);
 	mem_cgroup_free(memcg);
-	return ERR_PTR(-ENOMEM);
+	return ERR_PTR(error);
 }
 
 static int mem_cgroup_css_online(struct cgroup_subsys_state *css)