coredump: fix unfreezable coredumping task

Message ID	1475225434-3753-1-git-send-email-aryabinin@virtuozzo.com (mailing list archive)
State	Not Applicable, archived
Headers	Return-Path: <linux-pm-owner@kernel.org>
	From: Andrey Ryabinin <aryabinin@virtuozzo.com>
	To: Alexander Viro <viro@zeniv.linux.org.uk>, Tejun Heo <tj@kernel.org>, "Rafael J. Wysocki" <rjw@rjwysocki.net>, Pavel Machek <pavel@ucw.cz>, "Oleg Nesterov" <oleg@redhat.com>
	CC: <linux-pm@vger.kernel.org>, <linux-fsdevel@vger.kernel.org>, <linux-kernel@vger.kernel.org>, Andrey Ryabinin <aryabinin@virtuozzo.com>, <stable@vger.kernel.org>
	Subject: [PATCH] coredump: fix unfreezable coredumping task
	Date: Fri, 30 Sep 2016 11:50:34 +0300
	Message-ID: <1475225434-3753-1-git-send-email-aryabinin@virtuozzo.com>
	MIME-Version: 1.0
	Content-Type: text/plain
	Received-SPF: None (protection.outlook.com: virtuozzo.com does not designate permitted sender hosts)
	Sender: linux-pm-owner@vger.kernel.org
	Precedence: bulk

Andrey Ryabinin Sept. 30, 2016, 8:50 a.m. UTC

It may be impossible to freeze a coredumping task while it waits
for the 'core_state->startup' completion, because its threads are frozen
in get_signal() before they get a chance to complete 'core_state->startup'.

Use freezer_do_not_count() to tell the freezer to ignore the coredumping
task while it waits for the core_state->startup completion.

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: stable@vger.kernel.org
---
 fs/coredump.c | 3 +++
 1 file changed, 3 insertions(+)
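
The failure mode described in the commit message can be illustrated with a
small userspace model. This is only a sketch under stated assumptions: every
toy_* name below is hypothetical and none of it is kernel code. It assumes
the freezer fails when some task is neither frozen nor marked to be skipped,
and that freezer_do_not_count() amounts to setting such a skip mark on the
current task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; these types are hypothetical, not the kernel's. */
enum toy_state {
	TOY_RUNNING,		/* will reach a freezing point on its own */
	TOY_FROZEN,		/* already parked by the (toy) freezer */
	TOY_UNINTERRUPTIBLE	/* stuck in an uninterruptible wait */
};

struct toy_task {
	enum toy_state state;
	bool freezer_skip;	/* assumed effect of freezer_do_not_count() */
};

/* One simplified freezer pass; returns true if the freeze succeeds. */
bool toy_try_to_freeze(struct toy_task *tasks, size_t n)
{
	size_t todo = 0;

	assert(tasks != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct toy_task *t = &tasks[i];

		/* A runnable task notices the request and freezes. */
		if (t->state == TOY_RUNNING)
			t->state = TOY_FROZEN;
		/* Frozen or skipped tasks do not block the freeze. */
		if (t->state == TOY_FROZEN || t->freezer_skip)
			continue;
		todo++;
	}
	return todo == 0;
}

/*
 * The scenario from the commit message: the dumper sleeps in
 * wait_for_completion() while its sibling threads were frozen
 * before they could complete core_state->startup.
 */
bool toy_coredump_scenario(bool dumper_uses_do_not_count)
{
	struct toy_task tasks[3] = {
		{ TOY_UNINTERRUPTIBLE, dumper_uses_do_not_count },
		{ TOY_FROZEN, false },
		{ TOY_FROZEN, false },
	};

	return toy_try_to_freeze(tasks, 3);
}
```

In this model toy_coredump_scenario(false) reports a failed freeze, because
the dumper can neither freeze nor be woken by its already frozen siblings,
while toy_coredump_scenario(true) succeeds because the dumper is skipped.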

Oleg Nesterov Sept. 30, 2016, 12:47 p.m. UTC | #1

On 09/30, Andrey Ryabinin wrote:
>
> @@ -423,7 +424,9 @@ static int coredump_wait(int exit_code, struct core_state *core_state)
>  	if (core_waiters > 0) {
>  		struct core_thread *ptr;
>  
> +		freezer_do_not_count();
>  		wait_for_completion(&core_state->startup);
> +		freezer_count();

Agreed... we could probably even do

	--- x/fs/coredump.c
	+++ x/fs/coredump.c
	@@ -423,7 +423,13 @@ static int coredump_wait(int exit_code, 
		if (core_waiters > 0) {
			struct core_thread *ptr;
	 
	-		wait_for_completion(&core_state->startup);
	+		if (wait_for_completion_interruptible(&core_state->startup)) {
	+			/* see the comment in dump_interrupted() */
	+			down_write(&mm->mmap_sem);
	+			coredump_finish(mm, false);
	+			up_write(&mm->mmap_sem);
	+			return -EINTR;
	+		}
			/*
			 * Wait for all the threads to become inactive, so that
			 * all the thread context (extended register state, like

but this change looks fine to me too.

Acked-by: Oleg Nesterov <oleg@redhat.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Pavel Machek Oct. 3, 2016, 9:41 a.m. UTC | #2

On Fri 2016-09-30 11:50:34, Andrey Ryabinin wrote:
> It could be not possible to freeze coredumping task when it waits
> for 'core_state->startup' completion, because threads are frozen
> in get_signal() before they got a chance to complete 'core_state->startup'.
> 
> Use freezer_do_not_count() to tell freezer to ignore coredumping
> task while it waits for core_state->startup completion.
> 
> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Cc: stable@vger.kernel.org

Acked-by: Pavel Machek <pavel@ucw.cz>

Michal Hocko Oct. 4, 2016, 7:18 a.m. UTC | #3

On Fri 30-09-16 14:47:41, Oleg Nesterov wrote:
> On 09/30, Andrey Ryabinin wrote:
> >
> > @@ -423,7 +424,9 @@ static int coredump_wait(int exit_code, struct core_state *core_state)
> >  	if (core_waiters > 0) {
> >  		struct core_thread *ptr;
> >  
> > +		freezer_do_not_count();
> >  		wait_for_completion(&core_state->startup);
> > +		freezer_count();
> 
> Agreed... we could probably even do
> 
> 	--- x/fs/coredump.c
> 	+++ x/fs/coredump.c
> 	@@ -423,7 +423,13 @@ static int coredump_wait(int exit_code, 
> 		if (core_waiters > 0) {
> 			struct core_thread *ptr;
> 	 
> 	-		wait_for_completion(&core_state->startup);
> 	+		if (wait_for_completion_interruptible(&core_state->startup)) {
> 	+			/* see the comment in dump_interrupted() */
> 	+			down_write(&mm->mmap_sem);
> 	+			coredump_finish(mm, false);
> 	+			up_write(&mm->mmap_sem);
> 	+			return -EINTR;
> 	+		}
> 			/*
> 			 * Wait for all the threads to become inactive, so that
> 			 * all the thread context (extended register state, like

This looks like a very good idea to me. We really want to make the whole
coredump_wait killable. I guess this should help us to remove the
hackish sig->flags & SIGNAL_GROUP_COREDUMP check from
__task_will_free_mem. Or are there any other problems that would make
oom victims in the middle of coredump problematic?

Oleg Nesterov Oct. 4, 2016, 4:13 p.m. UTC | #4

On 10/04, Michal Hocko wrote:
>
> On Fri 30-09-16 14:47:41, Oleg Nesterov wrote:
> > On 09/30, Andrey Ryabinin wrote:
> > >
> > > @@ -423,7 +424,9 @@ static int coredump_wait(int exit_code, struct core_state *core_state)
> > >  	if (core_waiters > 0) {
> > >  		struct core_thread *ptr;
> > >
> > > +		freezer_do_not_count();
> > >  		wait_for_completion(&core_state->startup);
> > > +		freezer_count();
> >
> > Agreed... we could probably even do
> >
> > 	--- x/fs/coredump.c
> > 	+++ x/fs/coredump.c
> > 	@@ -423,7 +423,13 @@ static int coredump_wait(int exit_code, 
> > 		if (core_waiters > 0) {
> > 			struct core_thread *ptr;
> > 	 
> > 	-		wait_for_completion(&core_state->startup);
> > 	+		if (wait_for_completion_interruptible(&core_state->startup)) {
> > 	+			/* see the comment in dump_interrupted() */
> > 	+			down_write(&mm->mmap_sem);
> > 	+			coredump_finish(mm, false);
> > 	+			up_write(&mm->mmap_sem);
> > 	+			return -EINTR;
> > 	+		}
> > 			/*
> > 			 * Wait for all the threads to become inactive, so that
> > 			 * all the thread context (extended register state, like
>
> This looks like a very good idea to me. We really want to make the whole
> coredump_wait killable.

Well, it is already killable. And with the change above it can sleep
in down_write(mmap_sem) and we really need this lock to abort, so it
won't necessarily react to SIGKILL faster.

> I guess this should help us to remove the
> hackish sig->flags & SIGNAL_GROUP_COREDUMP check from
> __task_will_free_mem.

Why? This doesn't depend on "killable". __task_will_free_mem() checks
this flag to detect the CLONE_VM processes which won't exit soon because
they participate in the coredumping.

Oleg.
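
The check discussed above can be sketched as a userspace model. This is a
simplification for illustration, not the kernel's __task_will_free_mem():
the flag values and the toy_task_will_free_mem() signature are hypothetical.
Only the early return on the coredump flag is taken from this thread; the
remaining branches are an assumption about how a caller might decide that a
task is about to exit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for this model; the kernel's differ. */
#define TOY_SIGNAL_GROUP_EXIT		0x1u
#define TOY_SIGNAL_GROUP_COREDUMP	0x2u

/*
 * Returns true if the task is expected to exit and release its
 * memory soon. 'single_threaded' and 'exiting' stand in for
 * whatever per-task state the real check consults (assumption).
 */
bool toy_task_will_free_mem(unsigned int sig_flags,
			    bool single_threaded, bool exiting)
{
	/* Only the two toy flags are understood by this model. */
	assert((sig_flags & ~(TOY_SIGNAL_GROUP_EXIT |
			      TOY_SIGNAL_GROUP_COREDUMP)) == 0);

	/*
	 * A process taking part in a coredump will not exit soon,
	 * whether or not the dumper's wait is killable.
	 */
	if (sig_flags & TOY_SIGNAL_GROUP_COREDUMP)
		return false;

	/* The whole thread group is already exiting. */
	if (sig_flags & TOY_SIGNAL_GROUP_EXIT)
		return true;

	/* A lone thread that is already on its way out. */
	return single_threaded && exiting;
}
```

The point of the model is that the coredump test comes first and depends
only on the flag, so making coredump_wait() killable does not by itself
make that test redundant.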

Michal Hocko Oct. 5, 2016, 9:17 a.m. UTC | #5

On Tue 04-10-16 18:13:05, Oleg Nesterov wrote:
> On 10/04, Michal Hocko wrote:
> >
> > On Fri 30-09-16 14:47:41, Oleg Nesterov wrote:
> > > On 09/30, Andrey Ryabinin wrote:
> > > >
> > > > @@ -423,7 +424,9 @@ static int coredump_wait(int exit_code, struct core_state *core_state)
> > > >  	if (core_waiters > 0) {
> > > >  		struct core_thread *ptr;
> > > >
> > > > +		freezer_do_not_count();
> > > >  		wait_for_completion(&core_state->startup);
> > > > +		freezer_count();
> > >
> > > Agreed... we could probably even do
> > >
> > > 	--- x/fs/coredump.c
> > > 	+++ x/fs/coredump.c
> > > 	@@ -423,7 +423,13 @@ static int coredump_wait(int exit_code, 
> > > 		if (core_waiters > 0) {
> > > 			struct core_thread *ptr;
> > > 	 
> > > 	-		wait_for_completion(&core_state->startup);
> > > 	+		if (wait_for_completion_interruptible(&core_state->startup)) {
> > > 	+			/* see the comment in dump_interrupted() */
> > > 	+			down_write(&mm->mmap_sem);
> > > 	+			coredump_finish(mm, false);
> > > 	+			up_write(&mm->mmap_sem);
> > > 	+			return -EINTR;
> > > 	+		}
> > > 			/*
> > > 			 * Wait for all the threads to become inactive, so that
> > > 			 * all the thread context (extended register state, like
> >
> > This looks like a very good idea to me. We really want to make the whole
> > coredump_wait killable.
> 
> Well, it is already killable. 

Except wait_for_completion is not killable, and the exiting tasks might
be blocked in a !killable state, preventing this one from continuing. But...

> And with the change above it can sleep
> in down_write(mmap_sem) and we really need this lock to abort, so it
> won't necessarily react to SIGKILL faster.

you are right that somebody might be holding mmap_sem and we cannot get
rid of it here.

> > I guess this should help us to remove the
> > hackish sig->flags & SIGNAL_GROUP_COREDUMP check from
> > __task_will_free_mem.
> 
> Why? This doesn't depend on "killable". __task_will_free_mem() checks
> this flag to detect the CLONE_VM processes which won't exit soon because
> they participate in the coredumping.

I just (wrongly) assumed that if we made this path completely killable
we could guarantee forward progress and get rid of the SIGNAL_GROUP_COREDUMP
check completely. But you are right, this won't be sufficient.

Andrey Ryabinin Nov. 7, 2016, 4:27 p.m. UTC | #6

On 09/30/2016 11:50 AM, Andrey Ryabinin wrote:
> It could be not possible to freeze coredumping task when it waits
> for 'core_state->startup' completion, because threads are frozen
> in get_signal() before they got a chance to complete 'core_state->startup'.
> 
> Use freezer_do_not_count() to tell freezer to ignore coredumping
> task while it waits for core_state->startup completion.
> 
> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Cc: stable@vger.kernel.org
> ---

Ping. Can someone apply this please?

>  fs/coredump.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/coredump.c b/fs/coredump.c
> index 281b768..eb9c92c 100644
> --- a/fs/coredump.c
> +++ b/fs/coredump.c
> @@ -1,6 +1,7 @@
>  #include <linux/slab.h>
>  #include <linux/file.h>
>  #include <linux/fdtable.h>
> +#include <linux/freezer.h>
>  #include <linux/mm.h>
>  #include <linux/stat.h>
>  #include <linux/fcntl.h>
> @@ -423,7 +424,9 @@ static int coredump_wait(int exit_code, struct core_state *core_state)
>  	if (core_waiters > 0) {
>  		struct core_thread *ptr;
>  
> +		freezer_do_not_count();
>  		wait_for_completion(&core_state->startup);
> +		freezer_count();
>  		/*
>  		 * Wait for all the threads to become inactive, so that
>  		 * all the thread context (extended register state, like
> 

Andrew Morton Nov. 7, 2016, 10:26 p.m. UTC | #7

On Fri, 30 Sep 2016 11:50:34 +0300 Andrey Ryabinin <aryabinin@virtuozzo.com> wrote:

> It could be not possible to freeze coredumping task when it waits
> for 'core_state->startup' completion, because threads are frozen
> in get_signal() before they got a chance to complete 'core_state->startup'.
> 
> Use freezer_do_not_count() to tell freezer to ignore coredumping
> task while it waits for core_state->startup completion.
> 
> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Cc: stable@vger.kernel.org

The changelog provides no reason why this patch should be merged into
-stable.  Nor into anything else, really.

Please (as always) provide a full description of the bug's end-user
visible effects.
