Message ID | 1433516477-5153-12-git-send-email-pmladek@suse.cz (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 06/05, Petr Mladek wrote: > > [*] In fact, there was a bug in the original code. It tried to process > a non-existing signal when the system was freezing. See the common > check for pending signal and freezing. And another bug afaics: > - case SIGSTOP: > - jffs2_dbg(1, "%s(): SIGSTOP received\n", > - __func__); > - set_current_state(TASK_STOPPED); > - schedule(); > - break; This is obviously racy, we can miss SIGCONT. Still I personally dislike the new kthread_sigaction() API. I agree, a couple if signal helpers for kthreads make sense. Say, void kthread_do_signal_stop(void) { spin_lock_irq(&curtent->sighand->siglock); if (current->jobctl & JOBCTL_STOP_DEQUEUED) __set_current_state(TASK_STOPPED); spin_unlock_irq(¤t->sighand->siglock); schedule(); } and probably even "int kthread_signal_deque(void)". But personally I do not think kthread_do_signal() makes a lot of sense... Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, 6 Jun 2015, Oleg Nesterov wrote: > Still I personally dislike the new kthread_sigaction() API. I agree, > a couple if signal helpers for kthreads make sense. Say, > > void kthread_do_signal_stop(void) > { > spin_lock_irq(&curtent->sighand->siglock); > if (current->jobctl & JOBCTL_STOP_DEQUEUED) > __set_current_state(TASK_STOPPED); > spin_unlock_irq(¤t->sighand->siglock); > > schedule(); > } ... not to mention the fact that 'STOP' keyword in relation to kthreads has completely different meaning today, which just contributes to overall confusion; but that's an independent story. > > and probably even "int kthread_signal_deque(void)". > > But personally I do not think kthread_do_signal() makes a lot of sense... Would it be possible for you to elaborate a little bit more why you think so ... ? I personally don't see a huge principal difference between "kthread_signal_dequeue() + kthread_do_signal_{stop,...}" vs. generic "kthread_do_signal()" that's just basically completely general and takes care of 'everything necessary'. That being said, my relationship to signal handling code is of course much less intimate compared to yours, so I am really curious what particular objections to that interface have. Thanks a lot,
On 06/06, Jiri Kosina wrote: > > On Sat, 6 Jun 2015, Oleg Nesterov wrote: > > > Still I personally dislike the new kthread_sigaction() API. I agree, > > a couple if signal helpers for kthreads make sense. Say, > > > > void kthread_do_signal_stop(void) > > { > > spin_lock_irq(&curtent->sighand->siglock); > > if (current->jobctl & JOBCTL_STOP_DEQUEUED) > > __set_current_state(TASK_STOPPED); > > spin_unlock_irq(¤t->sighand->siglock); > > > > schedule(); > > } > > ... not to mention the fact that 'STOP' keyword in relation to kthreads > has completely different meaning today, which just contributes to overall > confusion; but that's an independent story. Yes, agreed. > > But personally I do not think kthread_do_signal() makes a lot of sense... > > Would it be possible for you to elaborate a little bit more why you think > so ... ? Please see another email I sent in reply to 06/18. > I personally don't see a huge principal difference between > "kthread_signal_dequeue() + kthread_do_signal_{stop,...}" vs. generic > "kthread_do_signal()" that's just basically completely general and takes > care of 'everything necessary'. Then why do we need the new API ? And I do see the difference. Rightly or not I belive that this API buys nothing but makes the kthread && signal interaction more complex and confusing. For no reason. But! > That being said, my relationship to signal > handling code is of course much less intimate compared to yours, No, no, no, this doesn't matter at all ;) Yes I do dislike this API. So what? I can be wrong. So if other reviewers like it I will hate them all ^W^W^W not argure. So please comment. I never trust myself unless I can technically (try to) prove I am right. In this case I can't, this is only my feeling. Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, 7 Jun 2015, Oleg Nesterov wrote: > > > Still I personally dislike the new kthread_sigaction() API. I agree, > > > a couple if signal helpers for kthreads make sense. Say, > > > > > > void kthread_do_signal_stop(void) > > > { > > > spin_lock_irq(&curtent->sighand->siglock); > > > if (current->jobctl & JOBCTL_STOP_DEQUEUED) > > > __set_current_state(TASK_STOPPED); > > > spin_unlock_irq(¤t->sighand->siglock); > > > > > > schedule(); > > > } > > > > ... not to mention the fact that 'STOP' keyword in relation to kthreads > > has completely different meaning today, which just contributes to overall > > confusion; but that's an independent story. > > Yes, agreed. I BTW think this really needs to be fixed. We have kthread parking and kthread stopping at least (and that doesn't include kthreads that actually *do* handle SIGSTOP), and the naming is unfortunate and confusing. > > > But personally I do not think kthread_do_signal() makes a lot of sense... > > > > Would it be possible for you to elaborate a little bit more why you think > > so ... ? > > Please see another email I sent in reply to 06/18. Yeah, let's just continue more detailed discussion there. I saw your reply only after I answered to your original mail. > > I personally don't see a huge principal difference between > > "kthread_signal_dequeue() + kthread_do_signal_{stop,...}" vs. generic > > "kthread_do_signal()" that's just basically completely general and > > takes care of 'everything necessary'. > > Then why do we need the new API ? Well, in a nutshell, because of the "it's general and takes care of everything" part. > And I do see the difference. Rightly or not I belive that this API buys > nothing but makes the kthread && signal interaction more complex and > confusing. For no reason. Current situation with kthrads is a mess. Everyone and his grand-son needs to put explicit try_to_freeze(), cond_resched() and whatever else checks in his private kthreads main loop. The fact that kthreads are more and more prone to making use of signal handling makes this even more complicated, because everyone is reinventing his own wheel for this. It can't be really properly reviewed and - more importantly - it makes the whole "how would this particular kthread react to this particular signal?" very unpredictable and hard to answer question. IMO Petr's patchset basically brings some kind order to this all -- it "batches" the kthread executions to individual iterations of the main loop, and allows to perform actions on the border of the iteration (such as, but not limited to, handling of the signals). Signal handling is just one of the piggy-backers on top of this general cleanup. Thanks,
On 06/07, Jiri Kosina wrote: > > On Sun, 7 Jun 2015, Oleg Nesterov wrote: > > > > I personally don't see a huge principal difference between > > > "kthread_signal_dequeue() + kthread_do_signal_{stop,...}" vs. generic > > > "kthread_do_signal()" that's just basically completely general and > > > takes care of 'everything necessary'. > > > > Then why do we need the new API ? > > Well, in a nutshell, because of the "it's general and takes care of > everything" part. ... > Signal handling is just > one of the piggy-backers on top of this general cleanup. And to avoid the confusion: so far I only argued with the signal handling part of this API. Namely with kthread_do_signal(), especially with the SIG_DFL logic. If we want somthing like kthread_iterant agree it should probably help to handle the signals too. But afaics kthread_do_signal() doesn't really help and certainly it is not strictly necessary. Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/jffs2/background.c b/fs/jffs2/background.c index 6af076b8f60f..50c16048ba2d 100644 --- a/fs/jffs2/background.c +++ b/fs/jffs2/background.c @@ -21,99 +21,86 @@ #include <linux/kthread.h> #include "nodelist.h" -static int jffs2_garbage_collect_thread(void *_c) +static void jffs2_garbage_collect_thread_sighup(int sig) +{ + jffs2_dbg(1, "%s(): SIGHUP received\n", __func__); +} + +static void jffs2_garbage_collect_thread_sigkill(int sig) +{ + jffs2_dbg(1, "%s(): SIGKILL received\n", __func__); + kthread_stop_current(); +} + +static void jffs2_garbage_collect_thread_init(void *_c) { struct jffs2_sb_info *c = _c; - sigset_t hupmask; - siginitset(&hupmask, sigmask(SIGHUP)); - allow_signal(SIGKILL); - allow_signal(SIGSTOP); - allow_signal(SIGCONT); - allow_signal(SIGHUP); + kthread_sigaction(SIGKILL, jffs2_garbage_collect_thread_sigkill); + kthread_sigaction(SIGHUP, jffs2_garbage_collect_thread_sighup); + kthread_sigaction(SIGSTOP, KTHREAD_SIG_DFL); c->gc_task = current; complete(&c->gc_thread_start); set_user_nice(current, 10); +}; + +static void jffs2_garbage_collect_thread_func(void *_c) +{ + struct jffs2_sb_info *c = _c; + sigset_t hupmask; + + siginitset(&hupmask, sigmask(SIGHUP)); - set_freezable(); - for (;;) { - sigprocmask(SIG_UNBLOCK, &hupmask, NULL); - again: - spin_lock(&c->erase_completion_lock); - if (!jffs2_thread_should_wake(c)) { - set_current_state (TASK_INTERRUPTIBLE); - spin_unlock(&c->erase_completion_lock); - jffs2_dbg(1, "%s(): sleeping...\n", __func__); - schedule(); - } else { - spin_unlock(&c->erase_completion_lock); - } - /* Problem - immediately after bootup, the GCD spends a lot - * of time in places like jffs2_kill_fragtree(); so much so - * that userspace processes (like gdm and X) are starved - * despite plenty of cond_resched()s and renicing. Yield() - * doesn't help, either (presumably because userspace and GCD - * are generally competing for a higher latency resource - - * disk). - * This forces the GCD to slow the hell down. Pulling an - * inode in with read_inode() is much preferable to having - * the GC thread get there first. */ - schedule_timeout_interruptible(msecs_to_jiffies(50)); - - if (kthread_should_stop()) { - jffs2_dbg(1, "%s(): kthread_stop() called\n", __func__); - goto die; - } - - /* Put_super will send a SIGKILL and then wait on the sem. - */ - while (signal_pending(current) || freezing(current)) { - siginfo_t info; - unsigned long signr; - - if (try_to_freeze()) - goto again; - - signr = dequeue_signal_lock(current, ¤t->blocked, &info); - - switch(signr) { - case SIGSTOP: - jffs2_dbg(1, "%s(): SIGSTOP received\n", - __func__); - set_current_state(TASK_STOPPED); - schedule(); - break; - - case SIGKILL: - jffs2_dbg(1, "%s(): SIGKILL received\n", - __func__); - goto die; - - case SIGHUP: - jffs2_dbg(1, "%s(): SIGHUP received\n", - __func__); - break; - default: - jffs2_dbg(1, "%s(): signal %ld received\n", - __func__, signr); - } - } - /* We don't want SIGHUP to interrupt us. STOP and KILL are OK though. */ - sigprocmask(SIG_BLOCK, &hupmask, NULL); - - jffs2_dbg(1, "%s(): pass\n", __func__); - if (jffs2_garbage_collect_pass(c) == -ENOSPC) { - pr_notice("No space for garbage collection. Aborting GC thread\n"); - goto die; - } + spin_lock(&c->erase_completion_lock); + if (!jffs2_thread_should_wake(c)) { + set_kthread_iterant_int_sleep(); + spin_unlock(&c->erase_completion_lock); + jffs2_dbg(1, "%s(): sleeping...\n", __func__); + return; + } + spin_unlock(&c->erase_completion_lock); + + /* Problem - immediately after bootup, the GCD spends a lot + * of time in places like jffs2_kill_fragtree(); so much so + * that userspace processes (like gdm and X) are starved + * despite plenty of cond_resched()s and renicing. Yield() + * doesn't help, either (presumably because userspace and GCD + * are generally competing for a higher latency resource - + * disk). + * This forces the GCD to slow the hell down. Pulling an + * inode in with read_inode() is much preferable to having + * the GC thread get there first. + */ + schedule_timeout_interruptible(msecs_to_jiffies(50)); + + if (kthread_should_stop()) { + jffs2_dbg(1, "%s(): kthread_stop() called\n", __func__); + return; } - die: + + try_to_freeze(); + + /* We don't want SIGHUP to interrupt us. STOP and KILL are OK though. */ + sigprocmask(SIG_BLOCK, &hupmask, NULL); + + jffs2_dbg(1, "%s(): pass\n", __func__); + if (jffs2_garbage_collect_pass(c) == -ENOSPC) { + pr_notice("No space for garbage collection. Aborting GC thread\n"); + kthread_stop_current(); + } + sigprocmask(SIG_UNBLOCK, &hupmask, NULL); +} + +static void jffs2_garbage_collect_thread_destroy(void *_c) +{ + struct jffs2_sb_info *c = _c; + spin_lock(&c->erase_completion_lock); c->gc_task = NULL; spin_unlock(&c->erase_completion_lock); - complete_and_exit(&c->gc_thread_exit, 0); + complete(&c->gc_thread_exit); } void jffs2_garbage_collect_trigger(struct jffs2_sb_info *c) @@ -126,16 +113,21 @@ void jffs2_garbage_collect_trigger(struct jffs2_sb_info *c) /* This must only ever be called when no GC thread is currently running */ int jffs2_start_garbage_collect_thread(struct jffs2_sb_info *c) { + static struct kthread_iterant kti = { + .init = jffs2_garbage_collect_thread_init, + .func = jffs2_garbage_collect_thread_func, + .destroy = jffs2_garbage_collect_thread_destroy, + }; struct task_struct *tsk; int ret = 0; BUG_ON(c->gc_task); + kti.data = c; init_completion(&c->gc_thread_start); init_completion(&c->gc_thread_exit); - tsk = kthread_run(jffs2_garbage_collect_thread, c, - "jffs2_gcd_mtd%d", c->mtd->index); + tsk = kthread_iterant_run(&kti, "jffs2_gcd_mtd%d", c->mtd->index); if (IS_ERR(tsk)) { pr_warn("fork failed for JFFS2 garbage collect thread: %ld\n", -PTR_ERR(tsk));
The new iterant kthread API allows to define a common checkpoint for freezing, parking, termination, and even signal handling. It will allow to maintain kthreads more easily and make the operations more reliable[*]. The kthread function is split into optional init(), func(), destroy() parts where func() is called in a cycle. The common check point is after each func() finishes. See kthread_iterant_fn() for more details. func() does not need to call schedule(). Instead, we call set_kthread_iterant_int_sleep() and let the generic code to handle the sleeping properly. There are few small functional changes: + SIGSTOP uses the standard handler and could not print debug messages any longer + there is no need to enable or define sighandler for SIGCONT; it is handled by prepare_signal() and could not get disabled + the debug message for default signal handler was removed; It is handled in kernel_do_signal() a standard way. In reality, this never happens because all signals are disabled except for the four ones enabled by kthread_sigaction(); + we call complete() instead of complete_and_exit(); do_exit() is called in kthread() anyway after the function returns. [*] In fact, there was a bug in the original code. It tried to process a non-existing signal when the system was freezing. See the common check for pending signal and freezing. Signed-off-by: Petr Mladek <pmladek@suse.cz> --- fs/jffs2/background.c | 156 ++++++++++++++++++++++++-------------------------- 1 file changed, 74 insertions(+), 82 deletions(-)