diff mbox

[6/6] freezer: kill unused set_freezable_with_signal()

Message ID 1314988070-12244-7-git-send-email-tj@kernel.org (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

Tejun Heo Sept. 2, 2011, 6:27 p.m. UTC
There's no in-kernel user of set_freezable_with_signal() and there's
no plan to add one.  Mixing TIF_SIGPENDING with kernel threads can
lead to nasty corner cases as kernel threads never travel signal
delivery path on their own.

e.g. the current implementation is buggy in the cancelation path of
__thaw_task().  It calls recalc_sigpending_and_wake() in an attempt to
clear TIF_SIGPENDING but the function never clears it regardless of
sigpending state.  This means that signallable freezable kthreads may
continue executing with !freezing() && stuck TIF_SIGPENDING, which can
be troublesome.

This patch removes set_freezable_with_signal() along with
PF_FREEZER_NOSIG and recalc_sigpending*() calls in freezer.  User
tasks get TIF_SIGPENDING, kernel tasks get woken up and the spurious
sigpending is dealt with in the usual signal delivery path.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/freezer.h |   20 +-------------------
 include/linux/sched.h   |    1 -
 kernel/freezer.c        |   27 ++++++---------------------
 kernel/kthread.c        |    2 +-
 4 files changed, 8 insertions(+), 42 deletions(-)

Comments

Oleg Nesterov Sept. 4, 2011, 6:46 p.m. UTC | #1
On 09/03, Tejun Heo wrote:
>
> This patch removes set_freezable_with_signal() along with
> PF_FREEZER_NOSIG

Great. I never understood PF_FREEZER_NOSIG logic ;)

One question,

> @@ -72,10 +72,6 @@ bool __refrigerator(bool check_kthr_stop)
>  		schedule();
>  	}
>
> -	spin_lock_irq(&current->sighand->siglock);
> -	recalc_sigpending(); /* We sent fake signal, clean it up */
> -	spin_unlock_irq(&current->sighand->siglock);
> -

Why? This recalc_sigpending() makes sense imho. Otherwise the user-space
tasks (almost) always return with TIF_SIGPENDING. May be we can do this
under "if (PF_KTRHREAD)".

For example. Suppose the user-space task does wait_event_freezable()...

Hmm. OTOH, wait_event_freezable() looks wrong anyway... So probably
this doesn't matter. ptrace_stop/get_signal_to_deliver doesn't need
this, probably we do not care about the other callers.

It seems, a lot of get_signal_to_deliver() calles also call
try_to_freeze() for no reason.


So, yes, I am starting to think this change is fine too ;)

Oleg.
Tejun Heo Sept. 5, 2011, 2:33 a.m. UTC | #2
Hello, Oleg.

On Sun, Sep 04, 2011 at 08:46:26PM +0200, Oleg Nesterov wrote:
> > @@ -72,10 +72,6 @@ bool __refrigerator(bool check_kthr_stop)
> >  		schedule();
> >  	}
> >
> > -	spin_lock_irq(&current->sighand->siglock);
> > -	recalc_sigpending(); /* We sent fake signal, clean it up */
> > -	spin_unlock_irq(&current->sighand->siglock);
> > -
> 
> Why? This recalc_sigpending() makes sense imho. Otherwise the user-space
> tasks (almost) always return with TIF_SIGPENDING. May be we can do this
> under "if (PF_KTRHREAD)".

PF_KTHREAD no longer gets TIF_SIGPENDING set, so...

> For example. Suppose the user-space task does wait_event_freezable()...
> 
> Hmm. OTOH, wait_event_freezable() looks wrong anyway... So probably
> this doesn't matter. ptrace_stop/get_signal_to_deliver doesn't need
> this, probably we do not care about the other callers.

Can you elaborate on it being wrong?  Do you mean the possibility of
leaking spurious TIF_SIGPENDING?

> It seems, a lot of get_signal_to_deliver() calles also call
> try_to_freeze() for no reason.

Yeap, it doesn't really matter.  In most cases userland tasks would
get parked in the signal delivery path anyway and if a task is deep in
the kernel, it should and usually does hit syscall restart path.
Except for few special cases (maybe some of unfailable NFS calls),
there'userland tasks shouldn't take try_to_freeze() path outside of
signal delivery / job control path.

Also, in general, I don't think it's a good idea to add non-trivial
optimization like above to code path which is as cold as it gets
without any backing data.  The PM freezer used to busy loop until all
tasks enter refrigerator.  Nobody cares whether some tasks enter
signal delivery path one more time.

> So, yes, I am starting to think this change is fine too ;)

Yeay. :)

Thanks.
Tejun Heo Sept. 5, 2011, 2:35 a.m. UTC | #3
On Mon, Sep 05, 2011 at 11:33:15AM +0900, Tejun Heo wrote:
> > So, yes, I am starting to think this change is fine too ;)
> 
> Yeay. :)

Ooh, can I add your Acked-by before sending pull request to Rafael?

Thanks.
Oleg Nesterov Sept. 5, 2011, 4:20 p.m. UTC | #4
On 09/05, Tejun Heo wrote:
>
> Hello, Oleg.
>
> On Sun, Sep 04, 2011 at 08:46:26PM +0200, Oleg Nesterov wrote:
> > > @@ -72,10 +72,6 @@ bool __refrigerator(bool check_kthr_stop)
> > >  		schedule();
> > >  	}
> > >
> > > -	spin_lock_irq(&current->sighand->siglock);
> > > -	recalc_sigpending(); /* We sent fake signal, clean it up */
> > > -	spin_unlock_irq(&current->sighand->siglock);
> > > -
> >
> > Why? This recalc_sigpending() makes sense imho. Otherwise the user-space
> > tasks (almost) always return with TIF_SIGPENDING. May be we can do this
> > under "if (PF_KTRHREAD)".
>
> PF_KTHREAD no longer gets TIF_SIGPENDING set, so...

Yes,

> > For example. Suppose the user-space task does wait_event_freezable()...
> >
> > Hmm. OTOH, wait_event_freezable() looks wrong anyway... So probably
> > this doesn't matter. ptrace_stop/get_signal_to_deliver doesn't need
> > this, probably we do not care about the other callers.
>
> Can you elaborate on it being wrong?  Do you mean the possibility of
> leaking spurious TIF_SIGPENDING?

Perhaps it is correct... Just I do not understand what it should do.
I thought it is "wait_for_event && do_not_block_freezer". And at first
glance the code looks as if it tries to do this. Say, in the "likely"
case we restart wait_event_interruptible() after refrigerator().

But this looks racy. Suppose that freezing() is already false when
try_to_freeze() or __refrigerator() is called. Say, cgroup_freezer does
freeze_task() + __thaw_task(). Why it returns -ERESTARTSYS in this case?

And if it can be used by the userspace thread, then we should probably
do recalc_sigpending() somewhere, otherwise wait_event_freezable() will
always return -ERESTARTSYS after __refrigerator().

Oleg.
Oleg Nesterov Sept. 5, 2011, 4:21 p.m. UTC | #5
On 09/05, Tejun Heo wrote:
>
> Ooh, can I add your Acked-by before sending pull request to Rafael?

Yes, thanks, please feel free.

Oleg.
Tejun Heo Sept. 6, 2011, 3:28 a.m. UTC | #6
Hello,

On Mon, Sep 05, 2011 at 06:20:12PM +0200, Oleg Nesterov wrote:
> Perhaps it is correct... Just I do not understand what it should do.
> I thought it is "wait_for_event && do_not_block_freezer". And at first
> glance the code looks as if it tries to do this. Say, in the "likely"
> case we restart wait_event_interruptible() after refrigerator().
> 
> But this looks racy. Suppose that freezing() is already false when
> try_to_freeze() or __refrigerator() is called. Say, cgroup_freezer does
> freeze_task() + __thaw_task(). Why it returns -ERESTARTSYS in this case?

It may return -ERESTARTSYS when not strictly necessary but given that
it's supposed to trigger restart anyway I don't think it's actually
broken (it doesn't break the contract of the wait).  Another thing to
note is that kthread freezing via cgroup is almost fundamentally
broken.  With the removal of freezable_with_signal, this shouldn't be
an issue anymore although cgroup_freezer still needs to be fixed to
account for kthreads for xstate transitions (it currently triggers
BUG_ON).

> And if it can be used by the userspace thread, then we should probably
> do recalc_sigpending() somewhere, otherwise wait_event_freezable() will
> always return -ERESTARTSYS after __refrigerator().

Thankfully, currently, all the few users are kthreads.  I don't think
it makes sense for userland tasks at all.  Interruptible sleeps for
userland tasks necessiate the ability to return to signal delivery
path and restart or abort the current operation.  There's no point in
using wait_event_freezable() instead of wait_event_interruptible().

Thanks.
Oleg Nesterov Sept. 6, 2011, 3:18 p.m. UTC | #7
Hi,

On 09/06, Tejun Heo wrote:
>
> Hello,
>
> On Mon, Sep 05, 2011 at 06:20:12PM +0200, Oleg Nesterov wrote:
> > Perhaps it is correct... Just I do not understand what it should do.
> > I thought it is "wait_for_event && do_not_block_freezer". And at first
> > glance the code looks as if it tries to do this. Say, in the "likely"
> > case we restart wait_event_interruptible() after refrigerator().
> >
> > But this looks racy. Suppose that freezing() is already false when
> > try_to_freeze() or __refrigerator() is called. Say, cgroup_freezer does
> > freeze_task() + __thaw_task(). Why it returns -ERESTARTSYS in this case?
>
> It may return -ERESTARTSYS when not strictly necessary but given that
> it's supposed to trigger restart anyway I don't think it's actually
> broken (it doesn't break the contract of the wait).

OK, but still this doesn't look right.

> Another thing to
> note is that kthread freezing via cgroup is almost fundamentally
> broken.

OK, let it be freeze_processes()+thaw_tasks(). Of course this is mostly
theoretical.

> > And if it can be used by the userspace thread, then we should probably
> > do recalc_sigpending() somewhere, otherwise wait_event_freezable() will
> > always return -ERESTARTSYS after __refrigerator().
>
> Thankfully, currently, all the few users are kthreads.  I don't think
> it makes sense for userland tasks at all.

Yes, agreed. In this case I think it should be

	#define wait_event_freezable(wq, condition)				\
	({									\
		int __retval;							\
		for (;;) {							\
			__retval = wait_event_interruptible(wq, 		\
					(condition) || freezing(current));	\
			if (__retval || (condition))				\
				break;						\
			try_to_freeze();					\
		}								\
		__retval;							\
	})

__retval/ERESTARTSYS is only needed for kthreads which play with allow_signal(),
probably nobody should do this.

Oleg.
Oleg Nesterov Sept. 6, 2011, 3:25 p.m. UTC | #8
On 09/06, Oleg Nesterov wrote:
>
> Yes, agreed. In this case I think it should be
>
> 	#define wait_event_freezable(wq, condition)				\
> 	({									\
> 		int __retval;							\
> 		for (;;) {							\
> 			__retval = wait_event_interruptible(wq, 		\
> 					(condition) || freezing(current));	\
> 			if (__retval || (condition))				\
> 				break;						\
> 			try_to_freeze();					\
> 		}								\
> 		__retval;							\
> 	})
>
> __retval/ERESTARTSYS is only needed for kthreads which play with allow_signal(),
> probably nobody should do this.

I meant, unless the caller plays with allow_signal(), it has all rights to do

	if (wait_event_freezable(...))
		BUG();

This becomes correct with the code above.

Oleg.
Tejun Heo Sept. 6, 2011, 3:53 p.m. UTC | #9
Hello,

On Tue, Sep 06, 2011 at 05:25:39PM +0200, Oleg Nesterov wrote:
> On 09/06, Oleg Nesterov wrote:
> >
> > Yes, agreed. In this case I think it should be
> >
> > 	#define wait_event_freezable(wq, condition)				\
> > 	({									\
> > 		int __retval;							\
> > 		for (;;) {							\
> > 			__retval = wait_event_interruptible(wq, 		\
> > 					(condition) || freezing(current));	\
> > 			if (__retval || (condition))				\
> > 				break;						\
> > 			try_to_freeze();					\
> > 		}								\
> > 		__retval;							\
> > 	})
> >
> > __retval/ERESTARTSYS is only needed for kthreads which play with allow_signal(),
> > probably nobody should do this.
> 
> I meant, unless the caller plays with allow_signal(), it has all rights to do
> 
> 	if (wait_event_freezable(...))
> 		BUG();
> 
> This becomes correct with the code above.

Yeap, sure, w/ freezable_with_signal gone, the above should work fine.
Care to create a patch?

Thanks.
Oleg Nesterov Sept. 7, 2011, 6:21 p.m. UTC | #10
Hi,

On 09/07, Tejun Heo wrote:
> >
> > I meant, unless the caller plays with allow_signal(), it has all rights to do
> >
> > 	if (wait_event_freezable(...))
> > 		BUG();
> >
> > This becomes correct with the code above.
>
> Yeap, sure, w/ freezable_with_signal gone, the above should work fine.
> Care to create a patch?

Sure.

But. Tejun, Rafael, I have to rely on your review. I have no idea how
can I test this change. Hopfully it is simple enough, though.

And. wait_event_freezable_timeout() obviously has another problem,
although I hope this is fine. The caller can spend a lot of time in
refrigerator, shouldn't we update "timeout" in this case? I hope we
shouldn't.

Oleg.
diff mbox

Patch

diff --git a/include/linux/freezer.h b/include/linux/freezer.h
index 9193bae..7ffbf04 100644
--- a/include/linux/freezer.h
+++ b/include/linux/freezer.h
@@ -48,7 +48,7 @@  static inline bool try_to_freeze(void)
 }
 
 extern bool freeze_task(struct task_struct *p);
-extern bool __set_freezable(bool with_signal);
+extern bool set_freezable(void);
 
 #ifdef CONFIG_CGROUP_FREEZER
 extern bool cgroup_freezing(struct task_struct *task);
@@ -104,23 +104,6 @@  static inline int freezer_should_skip(struct task_struct *p)
 }
 
 /*
- * Tell the freezer that the current task should be frozen by it
- */
-static inline bool set_freezable(void)
-{
-	return __set_freezable(false);
-}
-
-/*
- * Tell the freezer that the current task should be frozen by it and that it
- * should send a fake signal to the task to freeze it.
- */
-static inline bool set_freezable_with_signal(void)
-{
-	return __set_freezable(true);
-}
-
-/*
  * Freezer-friendly wrappers around wait_event_interruptible() and
  * wait_event_interruptible_timeout(), originally defined in <linux/wait.h>
  */
@@ -164,7 +147,6 @@  static inline void freezer_do_not_count(void) {}
 static inline void freezer_count(void) {}
 static inline int freezer_should_skip(struct task_struct *p) { return 0; }
 static inline void set_freezable(void) {}
-static inline void set_freezable_with_signal(void) {}
 
 #define wait_event_freezable(wq, condition)				\
 		wait_event_interruptible(wq, condition)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 1bb3356..6d45ce7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1784,7 +1784,6 @@  extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
 #define PF_MEMPOLICY	0x10000000	/* Non-default NUMA mempolicy */
 #define PF_MUTEX_TESTER	0x20000000	/* Thread belongs to the rt mutex tester */
 #define PF_FREEZER_SKIP	0x40000000	/* Freezer should not count it as freezable */
-#define PF_FREEZER_NOSIG 0x80000000	/* Freezer won't send signals to it */
 
 /*
  * Only the _current_ task can read/write to tsk->flags, but other
diff --git a/kernel/freezer.c b/kernel/freezer.c
index da76b64..770e6f5 100644
--- a/kernel/freezer.c
+++ b/kernel/freezer.c
@@ -39,7 +39,7 @@  bool freezing_slow_path(struct task_struct *p)
 	if (pm_nosig_freezing || cgroup_freezing(p))
 		return true;
 
-	if (pm_freezing && !(p->flags & PF_FREEZER_NOSIG))
+	if (pm_freezing && !(p->flags & PF_KTHREAD))
 		return true;
 
 	return false;
@@ -72,10 +72,6 @@  bool __refrigerator(bool check_kthr_stop)
 		schedule();
 	}
 
-	spin_lock_irq(&current->sighand->siglock);
-	recalc_sigpending(); /* We sent fake signal, clean it up */
-	spin_unlock_irq(&current->sighand->siglock);
-
 	pr_debug("%s left refrigerator\n", current->comm);
 
 	/*
@@ -120,7 +116,7 @@  bool freeze_task(struct task_struct *p)
 		return false;
 	}
 
-	if (!(p->flags & PF_FREEZER_NOSIG)) {
+	if (!(p->flags & PF_KTHREAD)) {
 		fake_signal_wake_up(p);
 		/*
 		 * fake_signal_wake_up() goes through p's scheduler
@@ -145,28 +141,19 @@  void __thaw_task(struct task_struct *p)
 	 * be visible to @p as waking up implies wmb.  Waking up inside
 	 * freezer_lock also prevents wakeups from leaking outside
 	 * refrigerator.
-	 *
-	 * If !FROZEN, @p hasn't reached refrigerator, recalc sigpending to
-	 * avoid leaving dangling TIF_SIGPENDING behind.
 	 */
 	spin_lock_irqsave(&freezer_lock, flags);
-	if (frozen(p)) {
+	if (frozen(p))
 		wake_up_process(p);
-	} else {
-		spin_lock(&p->sighand->siglock);
-		recalc_sigpending_and_wake(p);
-		spin_unlock(&p->sighand->siglock);
-	}
 	spin_unlock_irqrestore(&freezer_lock, flags);
 }
 
 /**
- * __set_freezable - make %current freezable
- * @with_signal: do we want %TIF_SIGPENDING for notification too?
+ * set_freezable - make %current freezable
  *
  * Mark %current freezable and enter refrigerator if necessary.
  */
-bool __set_freezable(bool with_signal)
+bool set_freezable(void)
 {
 	might_sleep();
 
@@ -177,10 +164,8 @@  bool __set_freezable(bool with_signal)
 	 */
 	spin_lock_irq(&freezer_lock);
 	current->flags &= ~PF_NOFREEZE;
-	if (with_signal)
-		current->flags &= ~PF_FREEZER_NOSIG;
 	spin_unlock_irq(&freezer_lock);
 
 	return try_to_freeze();
 }
-EXPORT_SYMBOL(__set_freezable);
+EXPORT_SYMBOL(set_freezable);
diff --git a/kernel/kthread.c b/kernel/kthread.c
index a6cbeea..7a4c862 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -282,7 +282,7 @@  int kthreadd(void *unused)
 	set_cpus_allowed_ptr(tsk, cpu_all_mask);
 	set_mems_allowed(node_states[N_HIGH_MEMORY]);
 
-	current->flags |= PF_NOFREEZE | PF_FREEZER_NOSIG;
+	current->flags |= PF_NOFREEZE;
 
 	for (;;) {
 		set_current_state(TASK_INTERRUPTIBLE);