diff mbox series

ucounts: Fix regression preventing increasing of rlimits in init_user_ns

Message ID 87eeajswfc.fsf_-_@disp2133 (mailing list archive)
State Accepted
Commit 5ddf994fa22f78ae3742d72520a8c3e8521d96cd
Headers show
Series ucounts: Fix regression preventing increasing of rlimits in init_user_ns | expand

Commit Message

Eric W. Biederman Aug. 23, 2021, 9:06 p.m. UTC
"Ma, XinjianX" <xinjianx.ma@intel.com> reported:

> When lkp team run kernel selftests, we found after these series of patches, testcase mqueue: mq_perf_tests
> in kselftest failed with following message.
>
> # selftests: mqueue: mq_perf_tests
> #
> # Initial system state:
> #       Using queue path:                       /mq_perf_tests
> #       RLIMIT_MSGQUEUE(soft):                  819200
> #       RLIMIT_MSGQUEUE(hard):                  819200
> #       Maximum Message Size:                   8192
> #       Maximum Queue Size:                     10
> #       Nice value:                             0
> #
> # Adjusted system state for testing:
> #       RLIMIT_MSGQUEUE(soft):                  (unlimited)
> #       RLIMIT_MSGQUEUE(hard):                  (unlimited)
> #       Maximum Message Size:                   16777216
> #       Maximum Queue Size:                     65530
> #       Nice value:                             -20
> #       Continuous mode:                        (disabled)
> #       CPUs to pin:                            3
> # ./mq_perf_tests: mq_open() at 296: Too many open files
> not ok 2 selftests: mqueue: mq_perf_tests # exit=1
> ```
>
> Test env:
> rootfs: debian-10
> gcc version: 9

After investigation the problem turned out to be that ucount_max for
the rlimits in init_user_ns was being set to the initial rlimit value.
The practical problem is that ucount_max provides a limit that
applications inside the user namespace can not exceed.  Which means in
practice that rlimits that have been converted to use the ucount
infrastructure were not able to exceend their initial rlimits.

Solve this by setting the relevant values of ucount_max to
RLIM_INIFINITY.  A limit in init_user_ns is pointless so the code
should allow the values to grow as large as possible without riscking
an underflow or an overflow.

As the ltp test case was a bit of a pain I have reproduced the rlimit failure
and tested the fix with the following little C program:
> #include <stdio.h>
> #include <fcntl.h>
> #include <sys/stat.h>
> #include <mqueue.h>
> #include <sys/time.h>
> #include <sys/resource.h>
> #include <errno.h>
> #include <string.h>
> #include <stdlib.h>
> #include <limits.h>
> #include <unistd.h>
>
> int main(int argc, char **argv)
> {
> 	struct mq_attr mq_attr;
> 	struct rlimit rlim;
> 	mqd_t mqd;
> 	int ret;
>
> 	ret = getrlimit(RLIMIT_MSGQUEUE, &rlim);
> 	if (ret != 0) {
> 		fprintf(stderr, "getrlimit(RLIMIT_MSGQUEUE) failed: %s\n", strerror(errno));
> 		exit(EXIT_FAILURE);
> 	}
> 	printf("RLIMIT_MSGQUEUE %lu %lu\n",
> 	       rlim.rlim_cur, rlim.rlim_max);
> 	rlim.rlim_cur = RLIM_INFINITY;
> 	rlim.rlim_max = RLIM_INFINITY;
> 	ret = setrlimit(RLIMIT_MSGQUEUE, &rlim);
> 	if (ret != 0) {
> 		fprintf(stderr, "setrlimit(RLIMIT_MSGQUEUE, RLIM_INFINITY) failed: %s\n", strerror(errno));
> 		exit(EXIT_FAILURE);
> 	}
>
> 	memset(&mq_attr, 0, sizeof(struct mq_attr));
> 	mq_attr.mq_maxmsg = 65536 - 1;
> 	mq_attr.mq_msgsize = 16*1024*1024 - 1;
>
> 	mqd = mq_open("/mq_rlimit_test", O_RDONLY|O_CREAT, 0600, &mq_attr);
> 	if (mqd == (mqd_t)-1) {
> 		fprintf(stderr, "mq_open failed: %s\n", strerror(errno));
> 		exit(EXIT_FAILURE);
> 	}
> 	ret = mq_close(mqd);
> 	if (ret) {
> 		fprintf(stderr, "mq_close failed; %s\n", strerror(errno));
> 		exit(EXIT_FAILURE);
> 	}
>
> 	return EXIT_SUCCESS;
> }

Fixes: 6e52a9f0532f ("Reimplement RLIMIT_MSGQUEUE on top of ucounts")
Fixes: d7c9e99aee48 ("Reimplement RLIMIT_MEMLOCK on top of ucounts")
Fixes: d64696905554 ("Reimplement RLIMIT_SIGPENDING on top of ucounts")
Fixes: 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts")
Reported-by: kernel test robot lkp@intel.com
Acked-by: Alexey Gladkov <legion@kernel.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---

This is a simplified version of my previous change that I have tested
and will push out to linux-next and then to Linus shortly.

 kernel/fork.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Ma Xinjian Aug. 24, 2021, 1:19 a.m. UTC | #1
> -----Original Message-----
> From: Eric W. Biederman <ebiederm@xmission.com>
> Sent: Tuesday, August 24, 2021 5:07 AM
> To: Alexey Gladkov <legion@kernel.org>
> Cc: Ma, XinjianX <xinjianx.ma@intel.com>; linux-kselftest@vger.kernel.org;
> lkp <lkp@intel.com>; akpm@linux-foundation.org; axboe@kernel.dk;
> christian.brauner@ubuntu.com; containers@lists.linux-foundation.org;
> jannh@google.com; keescook@chromium.org; kernel-
> hardening@lists.openwall.com; linux-kernel@vger.kernel.org; linux-
> mm@kvack.org; oleg@redhat.com; torvalds@linux-foundation.org
> Subject: [PATCH] ucounts: Fix regression preventing increasing of rlimits in
> init_user_ns
> 
> 
> "Ma, XinjianX" <xinjianx.ma@intel.com> reported:
> 
> > When lkp team run kernel selftests, we found after these series of
> > patches, testcase mqueue: mq_perf_tests in kselftest failed with following
> message.
> >
> > # selftests: mqueue: mq_perf_tests
> > #
> > # Initial system state:
> > #       Using queue path:                       /mq_perf_tests
> > #       RLIMIT_MSGQUEUE(soft):                  819200
> > #       RLIMIT_MSGQUEUE(hard):                  819200
> > #       Maximum Message Size:                   8192
> > #       Maximum Queue Size:                     10
> > #       Nice value:                             0
> > #
> > # Adjusted system state for testing:
> > #       RLIMIT_MSGQUEUE(soft):                  (unlimited)
> > #       RLIMIT_MSGQUEUE(hard):                  (unlimited)
> > #       Maximum Message Size:                   16777216
> > #       Maximum Queue Size:                     65530
> > #       Nice value:                             -20
> > #       Continuous mode:                        (disabled)
> > #       CPUs to pin:                            3
> > # ./mq_perf_tests: mq_open() at 296: Too many open files not ok 2
> > selftests: mqueue: mq_perf_tests # exit=1 ```
> >
> > Test env:
> > rootfs: debian-10
> > gcc version: 9
> 
> After investigation the problem turned out to be that ucount_max for the
> rlimits in init_user_ns was being set to the initial rlimit value.
> The practical problem is that ucount_max provides a limit that applications
> inside the user namespace can not exceed.  Which means in practice that
> rlimits that have been converted to use the ucount infrastructure were not
> able to exceend their initial rlimits.
> 
> Solve this by setting the relevant values of ucount_max to RLIM_INIFINITY.  A
> limit in init_user_ns is pointless so the code should allow the values to grow
> as large as possible without riscking an underflow or an overflow.
> 
> As the ltp test case was a bit of a pain I have reproduced the rlimit failure and
> tested the fix with the following little C program:
> > #include <stdio.h>
> > #include <fcntl.h>
> > #include <sys/stat.h>
> > #include <mqueue.h>
> > #include <sys/time.h>
> > #include <sys/resource.h>
> > #include <errno.h>
> > #include <string.h>
> > #include <stdlib.h>
> > #include <limits.h>
> > #include <unistd.h>
> >
> > int main(int argc, char **argv)
> > {
> > 	struct mq_attr mq_attr;
> > 	struct rlimit rlim;
> > 	mqd_t mqd;
> > 	int ret;
> >
> > 	ret = getrlimit(RLIMIT_MSGQUEUE, &rlim);
> > 	if (ret != 0) {
> > 		fprintf(stderr, "getrlimit(RLIMIT_MSGQUEUE) failed: %s\n",
> strerror(errno));
> > 		exit(EXIT_FAILURE);
> > 	}
> > 	printf("RLIMIT_MSGQUEUE %lu %lu\n",
> > 	       rlim.rlim_cur, rlim.rlim_max);
> > 	rlim.rlim_cur = RLIM_INFINITY;
> > 	rlim.rlim_max = RLIM_INFINITY;
> > 	ret = setrlimit(RLIMIT_MSGQUEUE, &rlim);
> > 	if (ret != 0) {
> > 		fprintf(stderr, "setrlimit(RLIMIT_MSGQUEUE, RLIM_INFINITY)
> failed: %s\n", strerror(errno));
> > 		exit(EXIT_FAILURE);
> > 	}
> >
> > 	memset(&mq_attr, 0, sizeof(struct mq_attr));
> > 	mq_attr.mq_maxmsg = 65536 - 1;
> > 	mq_attr.mq_msgsize = 16*1024*1024 - 1;
> >
> > 	mqd = mq_open("/mq_rlimit_test", O_RDONLY|O_CREAT, 0600,
> &mq_attr);
> > 	if (mqd == (mqd_t)-1) {
> > 		fprintf(stderr, "mq_open failed: %s\n", strerror(errno));
> > 		exit(EXIT_FAILURE);
> > 	}
> > 	ret = mq_close(mqd);
> > 	if (ret) {
> > 		fprintf(stderr, "mq_close failed; %s\n", strerror(errno));
> > 		exit(EXIT_FAILURE);
> > 	}
> >
> > 	return EXIT_SUCCESS;
> > }
> 
> Fixes: 6e52a9f0532f ("Reimplement RLIMIT_MSGQUEUE on top of ucounts")
> Fixes: d7c9e99aee48 ("Reimplement RLIMIT_MEMLOCK on top of ucounts")
> Fixes: d64696905554 ("Reimplement RLIMIT_SIGPENDING on top of ucounts")
> Fixes: 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts")
> Reported-by: kernel test robot lkp@intel.com
Sorry, but <> around email address is needed 
Reported-by: kernel test robot <lkp@intel.com>

> Acked-by: Alexey Gladkov <legion@kernel.org>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
> 
> This is a simplified version of my previous change that I have tested and will
> push out to linux-next and then to Linus shortly.
> 
>  kernel/fork.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/fork.c b/kernel/fork.c index bc94b2cc5995..44f4c2d83763
> 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -828,10 +828,10 @@ void __init fork_init(void)
>  	for (i = 0; i < MAX_PER_NAMESPACE_UCOUNTS; i++)
>  		init_user_ns.ucount_max[i] = max_threads/2;
> 
> -	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_NPROC,
> task_rlimit(&init_task, RLIMIT_NPROC));
> -	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MSGQUEUE,
> task_rlimit(&init_task, RLIMIT_MSGQUEUE));
> -	set_rlimit_ucount_max(&init_user_ns,
> UCOUNT_RLIMIT_SIGPENDING, task_rlimit(&init_task, RLIMIT_SIGPENDING));
> -	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MEMLOCK,
> task_rlimit(&init_task, RLIMIT_MEMLOCK));
> +	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_NPROC,
> RLIM_INFINITY);
> +	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MSGQUEUE,
> RLIM_INFINITY);
> +	set_rlimit_ucount_max(&init_user_ns,
> UCOUNT_RLIMIT_SIGPENDING, RLIM_INFINITY);
> +	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MEMLOCK,
> RLIM_INFINITY);
> 
>  #ifdef CONFIG_VMAP_STACK
>  	cpuhp_setup_state(CPUHP_BP_PREPARE_DYN,
> "fork:vm_stack_cache",
> --
> 2.20.1
Eric W. Biederman Aug. 24, 2021, 3:24 a.m. UTC | #2
"Ma, XinjianX" <xinjianx.ma@intel.com> writes:

>> -----Original Message-----
>> From: Eric W. Biederman <ebiederm@xmission.com>
>> ...
>> Reported-by: kernel test robot lkp@intel.com
> Sorry, but <> around email address is needed 
> Reported-by: kernel test robot <lkp@intel.com>

The change is already tested and pushed out so I really don't want to
mess with it.  Especially as I am aiming to send it to Linus on
Wednesday after it has had a chance to pass through linux-next and
whatever automated tests are there.

What does copying and pasting the Reported-by: tag as included in
your original report cause to break?

At this point I suspect that the danger of fat fingering something
far outweighs whatever benefits might be gained by surrounding the
email address with <> marks.

Eric
diff mbox series

Patch

diff --git a/kernel/fork.c b/kernel/fork.c
index bc94b2cc5995..44f4c2d83763 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -828,10 +828,10 @@  void __init fork_init(void)
 	for (i = 0; i < MAX_PER_NAMESPACE_UCOUNTS; i++)
 		init_user_ns.ucount_max[i] = max_threads/2;
 
-	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_NPROC, task_rlimit(&init_task, RLIMIT_NPROC));
-	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MSGQUEUE, task_rlimit(&init_task, RLIMIT_MSGQUEUE));
-	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_SIGPENDING, task_rlimit(&init_task, RLIMIT_SIGPENDING));
-	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MEMLOCK, task_rlimit(&init_task, RLIMIT_MEMLOCK));
+	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_NPROC,      RLIM_INFINITY);
+	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MSGQUEUE,   RLIM_INFINITY);
+	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_SIGPENDING, RLIM_INFINITY);
+	set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MEMLOCK,    RLIM_INFINITY);
 
 #ifdef CONFIG_VMAP_STACK
 	cpuhp_setup_state(CPUHP_BP_PREPARE_DYN, "fork:vm_stack_cache",