From patchwork Thu Mar 15 11:51:46 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 10284319 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id CD1A860291 for ; Thu, 15 Mar 2018 11:52:01 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B298B289B1 for ; Thu, 15 Mar 2018 11:52:01 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A7382289B7; Thu, 15 Mar 2018 11:52:01 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-1.8 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_NONE,T_DKIM_INVALID autolearn=no version=3.3.1 Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 29132289B1 for ; Thu, 15 Mar 2018 11:52:00 +0000 (UTC) Received: from [127.0.0.1] (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 64A5720956091; Thu, 15 Mar 2018 04:45:36 -0700 (PDT) X-Original-To: linux-nvdimm@lists.01.org Delivered-To: linux-nvdimm@lists.01.org Received-SPF: None (no SPF record) identity=mailfrom; client-ip=2001:8b0:10b:1231::1; helo=merlin.infradead.org; envelope-from=peterz@infradead.org; receiver=linux-nvdimm@lists.01.org Received: from merlin.infradead.org (merlin.infradead.org [IPv6:2001:8b0:10b:1231::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id B1296223FCF53 for ; Thu, 15 Mar 2018 04:45:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=merlin.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id: List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=HawuSPkRXzMyb7f57MS8zAsKMFZ8KB9+9+bN0yh6tY0=; b=Vc1hI++t7XjOx0S8DCyGsKzBd OIHEvMB3SWHoLI1BkfVtEwGRL9JO8RVD0FRa+QyhC4kM5GB3HP7PjogDoT6pSBzSfTbf+pqFf5ttc ln0C0DY7L00l1YD/i5blpiItSxrB1Y4zLQsazmL8Nsn/ayZIqYZImOcUkCmzq8SVVmzgmxCxNsxsV mRO3RnAWOEkJ/86VkExQwJL4mppcSYlyBtmXvgXtfdUjrzMLQ7yOGI3y2x24w3TiSzuVuVaAkalvH lDyZ9HPBOrsokgHd1n2eI0rzXcSpF3QGS7CJM9ynxZtNkSJ6gxYJ7CTKWR+OmbFnH38BxvDtDVCgb 82DTK2loA==; Received: from j217100.upc-j.chello.nl ([24.132.217.100] helo=hirez.programming.kicks-ass.net) by merlin.infradead.org with esmtpsa (Exim 4.90_1 #2 (Red Hat Linux)) id 1ewRQC-0002yV-JH; Thu, 15 Mar 2018 11:51:48 +0000 Received: by hirez.programming.kicks-ass.net (Postfix, from userid 1000) id EC85C2029B0EE; Thu, 15 Mar 2018 12:51:46 +0100 (CET) Date: Thu, 15 Mar 2018 12:51:46 +0100 From: Peter Zijlstra To: David Howells Subject: Re: [RFC][PATCH] sched/wait_bit: Introduce wait_var_event()/wake_up_var() Message-ID: <20180315115146.GX4064@hirez.programming.kicks-ass.net> References: <20180313102056.GJ4043@hirez.programming.kicks-ass.net> <152066488891.40260.14605734226832760468.stgit@dwillia2-desk3.amr.corp.intel.com> <152066493247.40260.10849841915366086021.stgit@dwillia2-desk3.amr.corp.intel.com> <20180311112725.GC4043@hirez.programming.kicks-ass.net> <3396.1521107922@warthog.procyon.org.uk> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <3396.1521107922@warthog.procyon.org.uk> User-Agent: Mutt/1.9.3 (2018-01-21) X-BeenThere: linux-nvdimm@lists.01.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "Linux-nvdimm developer list." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jan Kara , linux-nvdimm , david , Linux Kernel Mailing List , Ralf Baechle , linux-xfs , Ingo Molnar , linux-fsdevel , Christoph Hellwig Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" X-Virus-Scanned: ClamAV using ClamSMTP On Thu, Mar 15, 2018 at 09:58:42AM +0000, David Howells wrote: > Peter Zijlstra wrote: > > > > > Argh, no no no.. That whole wait_for_atomic_t thing is a giant > > > > trainwreck already and now you're making it worse still. > > Your patch description needs to say why this isn't a trainwreck when you > consider wait_for_atomic_t() to be one since it does things in a very similar > way. Does the below address things sufficiently clear? --- Subject: sched/wait: Introduce wait_var_event() From: Peter Zijlstra Date: Thu Mar 15 11:40:33 CET 2018 As a replacement for the wait_on_atomic_t() API provide the wait_var_event() API. The wait_var_event() API is based on the very same hashed-waitqueue idea, but doesn't care about the type (atomic_t) or the specific condition (atomic_read() == 0). IOW. it's much more widely applicable/flexible. It shares all the benefits/disadvantages of a hashed-waitqueue approach with the existing wait_on_atomic_t/wait_on_bit() APIs. The API is modeled after the existing wait_event() API, but instead of taking a wait_queue_head, it takes an address. This addresses is hashed to obtain a wait_queue_head from the bit_wait_table. Similar to the wait_event() API, it takes a condition expression as second argument and will wait until this expression becomes true. The following are (mostly) identical replacements: wait_on_atomic_t(&my_atomic, atomic_t_wait, TASK_UNINTERRUPTIBLE); wake_up_atomic_t(&my_atomic); wait_var_event(&my_atomic, !atomic_read(&my_atomic)); wake_up_var(&my_atomic); The only difference is that wake_up_var() is an unconditional wakeup and doesn't check the previously hard-coded (atomic_read() == 0) condition here. This is of little concequence, since most callers are already conditional on atomic_dec_and_test() and the ones that are not, are trivial to make so. Cc: David Howells Tested-by: Dan Williams Signed-off-by: Peter Zijlstra (Intel) --- include/linux/wait_bit.h | 70 +++++++++++++++++++++++++++++++++++++++++++++++ kernel/sched/wait_bit.c | 48 ++++++++++++++++++++++++++++++++ 2 files changed, 118 insertions(+) --- a/include/linux/wait_bit.h +++ b/include/linux/wait_bit.h @@ -262,4 +262,74 @@ int wait_on_atomic_t(atomic_t *val, wait return out_of_line_wait_on_atomic_t(val, action, mode); } +extern void init_wait_var_entry(struct wait_bit_queue_entry *wbq_entry, void *var, int flags); +extern void wake_up_var(void *var); +extern wait_queue_head_t *__var_waitqueue(void *p); + +#define ___wait_var_event(var, condition, state, exclusive, ret, cmd) \ +({ \ + __label__ __out; \ + struct wait_queue_head *__wq_head = __var_waitqueue(var); \ + struct wait_bit_queue_entry __wbq_entry; \ + long __ret = ret; /* explicit shadow */ \ + \ + init_wait_var_entry(&__wbq_entry, var, \ + exclusive ? WQ_FLAG_EXCLUSIVE : 0); \ + for (;;) { \ + long __int = prepare_to_wait_event(__wq_head, \ + &__wbq_entry.wq_entry, \ + state); \ + if (condition) \ + break; \ + \ + if (___wait_is_interruptible(state) && __int) { \ + __ret = __int; \ + goto __out; \ + } \ + \ + cmd; \ + } \ + finish_wait(__wq_head, &__wbq_entry.wq_entry); \ +__out: __ret; \ +}) + +#define __wait_var_event(var, condition) \ + ___wait_var_event(var, condition, TASK_UNINTERRUPTIBLE, 0, 0, \ + schedule()) + +#define wait_var_event(var, condition) \ +do { \ + might_sleep(); \ + if (condition) \ + break; \ + __wait_var_event(var, condition); \ +} while (0) + +#define __wait_var_event_killable(var, condition) \ + ___wait_var_event(var, condition, TASK_KILLABLE, 0, 0, \ + schedule()) + +#define wait_var_event_killable(var, condition) \ +({ \ + int __ret = 0; \ + might_sleep(); \ + if (!(condition)) \ + __ret = __wait_var_event_killable(var, condition); \ + __ret; \ +}) + +#define __wait_var_event_timeout(var, condition, timeout) \ + ___wait_var_event(var, ___wait_cond_timeout(condition), \ + TASK_UNINTERRUPTIBLE, 0, timeout, \ + __ret = schedule_timeout(__ret)) + +#define wait_var_event_timeout(var, condition, timeout) \ +({ \ + long __ret = timeout; \ + might_sleep(); \ + if (!___wait_cond_timeout(condition)) \ + __ret = __wait_var_event_timeout(var, condition, timeout); \ + __ret; \ +}) + #endif /* _LINUX_WAIT_BIT_H */ --- a/kernel/sched/wait_bit.c +++ b/kernel/sched/wait_bit.c @@ -149,6 +149,54 @@ void wake_up_bit(void *word, int bit) } EXPORT_SYMBOL(wake_up_bit); +wait_queue_head_t *__var_waitqueue(void *p) +{ + if (BITS_PER_LONG == 64) { + unsigned long q = (unsigned long)p; + + return bit_waitqueue((void *)(q & ~1), q & 1); + } + return bit_waitqueue(p, 0); +} +EXPORT_SYMBOL(__var_waitqueue); + +static int +var_wake_function(struct wait_queue_entry *wq_entry, unsigned int mode, + int sync, void *arg) +{ + struct wait_bit_key *key = arg; + struct wait_bit_queue_entry *wbq_entry = + container_of(wq_entry, struct wait_bit_queue_entry, wq_entry); + + if (wbq_entry->key.flags != key->flags || + wbq_entry->key.bit_nr != key->bit_nr) + return 0; + + return autoremove_wake_function(wq_entry, mode, sync, key); +} + +void init_wait_var_entry(struct wait_bit_queue_entry *wbq_entry, void *var, int flags) +{ + *wbq_entry = (struct wait_bit_queue_entry){ + .key = { + .flags = (var), + .bit_nr = -1, + }, + .wq_entry = { + .private = current, + .func = var_wake_function, + .entry = LIST_HEAD_INIT(wbq_entry->wq_entry.entry), + }, + }; +} +EXPORT_SYMBOL(init_wait_var_entry); + +void wake_up_var(void *var) +{ + __wake_up_bit(__var_waitqueue(var), var, -1); +} +EXPORT_SYMBOL(wake_up_var); + /* * Manipulate the atomic_t address to produce a better bit waitqueue table hash * index (we're keying off bit -1, but that would produce a horrible hash