From patchwork Wed Aug 2 23:23:31 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 9889347 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id EF23D601EB for ; Wed, 9 Aug 2017 00:10:40 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E03E528908 for ; Wed, 9 Aug 2017 00:10:40 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D48C928938; Wed, 9 Aug 2017 00:10:40 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.5 required=2.0 tests=BAYES_00, DATE_IN_PAST_96_XX, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 51C2B28908 for ; Wed, 9 Aug 2017 00:10:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752430AbdHIAKj (ORCPT ); Tue, 8 Aug 2017 20:10:39 -0400 Received: from mx2.suse.de ([195.135.220.15]:52751 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1752081AbdHIAKj (ORCPT ); Tue, 8 Aug 2017 20:10:39 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 2C6E1AC54 for ; Wed, 9 Aug 2017 00:10:38 +0000 (UTC) Date: Thu, 3 Aug 2017 01:23:31 +0200 From: "Luis R. Rodriguez" To: Kees Cook , Matt Redfearn Cc: "Luis R. Rodriguez" , Alexander Viro , Andrew Morton , David Howells , Dmitry Torokhov , Dan Carpenter , Jessica Yu , Michal Marek , Linus Torvalds , Greg Kroah-Hartman , Linux MIPS Mailing List , ebiederm@xmission.com, mingo@redhat.com, peterz@infradead.org, Petr Mladek , "linux-fsdevel@vger.kernel.org" , LKML Subject: Re: [RFC PATCH] exec: Avoid recursive modprobe for binary format handlers Message-ID: <20170802232331.GO18884@wotan.suse.de> References: <1500645920-28490-1-git-send-email-matt.redfearn@imgtec.com> <20170802001200.GD18884@wotan.suse.de> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.6.0 (2016-04-01) Sender: owner-linux-modules@vger.kernel.org Precedence: bulk List-ID: X-Virus-Scanned: ClamAV using ClamSMTP On Tue, Aug 01, 2017 at 07:28:20PM -0700, Kees Cook wrote: > On Tue, Aug 1, 2017 at 5:12 PM, Luis R. Rodriguez wrote: > > On Fri, Jul 21, 2017 at 03:05:20PM +0100, Matt Redfearn wrote: > >> Commit 6d7964a722af ("kmod: throttle kmod thread limit") which was > >> merged in v4.13-rc1 broke this behaviour since the recursive modprobe is > >> no longer caught, it just ends up waiting indefinitely for the kmod_wq > >> wait queue. Hence the kernel appears to hang silently when starting > >> userspace. > > > > Indeed, the recursive issue were no longer expected to exist. > > Errr, yeah, recursive binfmt loads can still happen. > > > The *old* implementation would also prevent a set of binaries to daisy chain > > a set of 50 different binaries which require different binfmt loaders. The > > current implementation enables this and we'd just wait. There's a bound to > > the number of binfmd loaders though, so this would be bounded. If however > > a 2nd loader loaded the first binary we'd run into the same issue I think. > > > > If we can't think of a good way to resolve this we'll just have to revert > > 6d7964a722af for now. > > The weird but "normal" recursive case is usually a script calling a > script calling a misc format. Getting a chain of modprobes running, > though, seems unlikely. I *think* Matt's patch is okay, but I agree, > it'd be better for the request_module() to fail. In that case how about we just have each waiter only wait max X seconds, if the number of concurrent ongoing modprobe calls hasn't reduced by a single digit in X seconds we give up on request_module() for the module and clearly indicate what happened. Matt, can you test? Note I've used wait_event_killable_timeout() to only accept SIGKILL for now. I've seen issues wit SIGCHILD and at modprobe this could even be a bigger issue, so this would restrict the signals received *only* to SIGKILL. It would be good to come up with a simple test case for this in tools/testing/selftests/kmod/kmod.sh Luis --- To unsubscribe from this list: send the line "unsubscribe linux-modules" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/include/linux/wait.h b/include/linux/wait.h index 5b74e36c0ca8..dc19880c02f5 100644 --- a/include/linux/wait.h +++ b/include/linux/wait.h @@ -757,6 +757,43 @@ extern int do_wait_intr_irq(wait_queue_head_t *, wait_queue_entry_t *); __ret; \ }) +#define __wait_event_killable_timeout(wq_head, condition, timeout) \ + ___wait_event(wq_head, ___wait_cond_timeout(condition), \ + TASK_KILLABLE, 0, timeout, \ + __ret = schedule_timeout(__ret)) + +/** + * wait_event_killable_timeout - sleep until a condition gets true or a timeout elapses + * @wq_head: the waitqueue to wait on + * @condition: a C expression for the event to wait for + * @timeout: timeout, in jiffies + * + * The process is put to sleep (TASK_KILLABLE) until the + * @condition evaluates to true or a kill signal is received. + * The @condition is checked each time the waitqueue @wq_head is woken up. + * + * wake_up() has to be called after changing any variable that could + * change the result of the wait condition. + * + * Returns: + * 0 if the @condition evaluated to %false after the @timeout elapsed, + * 1 if the @condition evaluated to %true after the @timeout elapsed, + * the remaining jiffies (at least 1) if the @condition evaluated + * to %true before the @timeout elapsed, or -%ERESTARTSYS if it was + * interrupted by a kill signal. + * + * Only kill signals interrupt this process. + */ +#define wait_event_killable_timeout(wq_head, condition, timeout) \ +({ \ + long __ret = timeout; \ + might_sleep(); \ + if (!___wait_cond_timeout(condition)) \ + __ret = __wait_event_killable_timeout(wq_head, \ + condition, timeout); \ + __ret; \ +}) + #define __wait_event_lock_irq(wq_head, condition, lock, cmd) \ (void)___wait_event(wq_head, condition, TASK_UNINTERRUPTIBLE, 0, 0, \ diff --git a/kernel/kmod.c b/kernel/kmod.c index 6d016c5d97c8..1b5f7bada8d2 100644 --- a/kernel/kmod.c +++ b/kernel/kmod.c @@ -71,6 +71,13 @@ static atomic_t kmod_concurrent_max = ATOMIC_INIT(MAX_KMOD_CONCURRENT); static DECLARE_WAIT_QUEUE_HEAD(kmod_wq); /* + * If modprobe can't be called after this time we assume its very likely + * your userspace has created a recursive dependency, and we'll have no + * option but to fail. + */ +#define MAX_KMOD_TIMEOUT 5 + +/* modprobe_path is set via /proc/sys. */ char modprobe_path[KMOD_PATH_LEN] = "/sbin/modprobe"; @@ -167,8 +174,18 @@ int __request_module(bool wait, const char *fmt, ...) pr_warn_ratelimited("request_module: kmod_concurrent_max (%u) close to 0 (max_modprobes: %u), for module %s, throttling...", atomic_read(&kmod_concurrent_max), MAX_KMOD_CONCURRENT, module_name); - wait_event_interruptible(kmod_wq, - atomic_dec_if_positive(&kmod_concurrent_max) >= 0); + ret = wait_event_killable_timeout(kmod_wq, + atomic_dec_if_positive(&kmod_concurrent_max) >= 0, + MAX_KMOD_TIMEOUT * HZ); + if (!ret) { + pr_warn_ratelimited("request_module: modprobe %s cannot be processed, kmod busy with %d threads for more than %d seconds now", + module_name, atomic_read(&kmod_concurrent_max), MAX_KMOD_TIMEOUT); + pr_warn_ratelimited("request_module: recursive modprobe call very likely!"); + return -ETIME; + } else if (ret == -ERESTARTSYS) { + pr_warn_ratelimited("request_module: sigkill sent for modprobe %s, giving up", module_name); + return ret; + } } trace_module_request(module_name, wait, _RET_IP_);